ABSTRACT

Scaling relations among galaxy cluster observables, which will become available in large
future samples of galaxy clusters, could be used to constrain not only cluster structure,
but also cosmology. We study the utility of this approach, employing a physically
motivated parametric model to describe cluster structure, and applying it to the expected
relation between the Sunyaev-Zel'dovich decrement ($S_\nu$) and the emission-weighted
X-ray temperature ($T_{\rm ew}$). The slope and normalization of the entropy profile, the
concentration of the dark matter potential, the pressure at the virial radius, and the
level of non-thermal pressure support, as well as the mass and redshift-dependence of
these quantities, are described by free parameters. With a suitable choice of fiducial
parameter values, the cluster model satisfies several existing observational constraints.
We employ a Fisher matrix approach to estimate the joint errors on cosmological and
cluster structure parameters from a measurement of $S_\nu$ vs. $T_{\rm ew}$ in a future survey. We
find that different cosmological parameters affect the scaling relation differently:
predominantly through the baryon fraction ($\Omega_m$ and $\Omega_b$), the virial over-density ($w_0$ and
$w_a$ for low-$z$ clusters) and the angular diameter distance ($w_0$, $w_a$ for high-$z$ clusters;
$\Omega_{\rm DE}$ and $h$). We find that the cosmology constraints from the scaling relation are
comparable to those expected from the number counts ($dN/dz$) of the same clusters.
The scaling relation approach is relatively insensitive to selection effects and it offers
a valuable consistency check; combining the information from the scaling relation and
$dN/dz$ is also useful to break parameter degeneracies and help disentangle cluster
physics from cosmology. Our work suggests that scaling relations should be a useful
component in extracting cosmological information from large future cluster surveys.
1 INTRODUCTION

This work is motivated by large upcoming cluster surveys
that utilize the Sunyaev-Zel'dovich (SZ) effect (Sunyaev &
Zeldovich 1972), such as ACT^1, APEX^2, Planck^3, and SPT^4.
As is well known, the SZ signal is nearly redshift independent,
so these surveys are expected to be especially efficient
in detecting high-redshift clusters. The expected catalogs
will be sensitive probes of dark energy, and also useful in
breaking degeneracies in local cluster surveys (for example,
the degeneracy between $\sigma_8$ and $\Omega_m$). The planned and ongoing
surveys will cover thousands of square degrees of sky,
and detect on the order of 10,000 clusters with masses
over a few $10^{14} M_\odot$. For example, the SPT survey will cover
4,000 deg$^2$ of sky in 4 frequency channels (90, 150, 220, 270
GHz), and Planck aims to cover the whole sky in 9 frequency
channels. These cluster samples will contain a significant
amount of cosmological information.

^1 http://www.physics.princeton.edu/act/

^2 http://bolo.berkeley.edu/apexsz/

^3 http://www.rssd.esa.int/index.php?project=Planck

^4 http://pole.uchicago.edu/

Importantly, cosmological information can be extracted
from large galaxy cluster catalogs in several complementary
ways. For example, the cluster abundance is exponentially
sensitive to the amplitude of matter density fluctuations,
and the X-ray temperature function obtained from local
cluster samples has been used to constrain $\Omega_m$ and $\sigma_8$ (e.g.,
Henry 2000; Ikebe et al. 2002). The redshift evolution of the
abundance will be useful in constraining dark energy parameters,
with statistical errors competitive with those in
most other methods (e.g., Haiman et al. 2001; Albrecht et
al. 2006). Small existing samples of tens of X-ray clusters
out to $z \sim 0.5$ already provide interesting constraints on the
dark energy density $\Omega_{\rm DE}$ and equation of state parameter $w$
(Henry 2004; Mantz et al. 2008; Vikhlinin et al. 2008). The
cluster power spectrum also contains information on cosmology
(Hu & Haiman 2003), both through the growth of fluctuations
(Refregier et al. 2002) and through baryon acoustic
features (Hu & Haiman 2003; Blake & Glazebrook 2003;
Seo & Eisenstein 2003; Linder 2003). Combining the number
counts and the power spectrum provides a cross-check and
can allow a "self-calibration" to contain systematic errors
in the mass-observable relation (Majumdar & Mohr 2004;
Wang et al. 2004). In addition to the above, clusters could
also be used as "standard rulers". The measured gas fraction
$f_{\rm gas}$, which is derived from the observed X-ray temperature
and density profiles, depends on the angular diameter distance
as $f_{\rm gas} \propto D_A^{3/2}$ (e.g., Allen et al. 2008). To the extent
that the gas fraction is predictable ab initio from numerical
simulations, this provides a measurement of $D_A(z)$. A
complementary measurement of $D_A(z)$ can be provided by
combining SZ and X-ray signals, under the assumption that
clusters are at least statistically spherical (e.g., Bonamente
et al. 2006).

The gravitational potential of clusters is dominated by
dark matter, whose behavior is determined by gravity alone,
and is therefore robustly predictable (Navarro et al. 1997;
hereafter NFW). If astrophysical processes in the gas were
unimportant, the intracluster gas would evolve adiabatically,
tracing the self-similar dark matter profile, and its global
properties would obey simple scaling relations (e.g., Kaiser
1986). In fact, observed clusters do exhibit scaling relations
that are tight, but which deviate significantly from the
self-similar expectation. For example, the relation between
X-ray luminosity ($L_X$) and temperature ($T_X$) is observed to be
close to $L_X \propto T_X^3$, significantly steeper than the $L_X \propto T_X^2$
power law expected in self-similar, adiabatic models. These
observations could be explained by preferentially increasing
the specific entropy of the cluster gas in low-mass clusters.
Many variants of such models have been developed,
based either on heat input from stars or nuclear black holes,
or preferential elimination of the low-entropy gas by star
formation (see, e.g., the review by Voit 2005 and references
therein). Our present study is motivated by the fact that in
any such model, the predicted scaling relations will generically
depend on the background cosmology. Using simple
toy models, Verde et al. (2002; hereafter VHS02) showed that
the cosmological parameters indeed affect cluster scaling
relations, i.e. relations among temperature, cluster size and
SZ decrement. In small cluster samples (e.g., Morandi et al.
2007), the subtle cosmology dependencies will be masked by
the larger uncertainties in the physical modeling of cluster
structure. However, given a sufficiently accurate measurement
of the scaling relations, using thousands of clusters, it
should become possible to place useful constraints simultaneously
on cosmological parameters and the parameters of
any given specific cluster physics model.

VHS02 argued that combining SZ and X-ray data will
be particularly useful, because the SZ and X-ray signatures
depend on cosmological parameters differently, and singled
out the relation between the Sunyaev-Zel'dovich decrement
($S_\nu$) and the X-ray temperature ($T_X$) as a promising probe
of both cluster structure and cosmology. Afshordi (2008)
showed that the measured relation between SZ decrement
and angular half-light radius, which does not require X-ray
data, may already help reduce the errors in cluster mass
estimates. Younger et al. (2006) showed that combining number
counts from SZ and X-ray surveys delivers constraints that
are tighter than adding two independent measurements in
quadrature; this synergy again arises because the SZ decrement
and X-ray flux depend differently on the background
cosmology. Finally, Aghanim et al. (2008) recently used
hydrodynamical simulations, and studied how different values
of the dark energy equation of state $w$ affect SZ vs. X-ray
scaling relations. They found relatively little direct sensitivity
to $w$ (which is consistent with our own findings; see
discussion in § 3.1 below).

Despite the above few works, the utility of the scaling
relations in probing cosmology remains relatively unexplored.
We believe it deserves more investigation, for the following
two reasons. First, data on the scaling relations will
be automatically available (at least for a subset of clusters)
once the planned SZ surveys are performed. Large catalogs
of cluster temperatures (hundreds of clusters) already exist,
and new, much deeper X-ray surveys are being proposed
and planned, such as eROSITA and IXO^5. Compared to the
number counts, the scaling relation technique should be
relatively less sensitive both to selection effects and to the
relation between the observables and cluster mass. Second, as
we will discuss below in detail, scaling relations derive
cosmological information from a different combination of
geometrical distances and non-linear growth than the other
cluster observables. For this reason, they could not only be
combined with other techniques to tighten constraints, but
can also serve as useful consistency checks.

In this paper, we follow VHS02, and we focus on the
relation between the total SZ flux decrement, encoded in
the integrated Compton $y$ parameter, and the X-ray
emission-weighted temperature. There are other physical quantities,
such as the X-ray luminosity or the central SZ decrement $y_0$.
These quantities are especially sensitive to the properties of
the cluster core, where cooling, star formation, and feedback
processes are most effective, and which is therefore the most
difficult region of the cluster to model. The scatter in these
quantities is known to be large, which will limit their utility
for constraining cosmology. In contrast, the integrated
Compton $y$ parameter and the mean emission-weighted temperature
are robust to the above uncertainties
(Reid & Spergel 2006 and Kravtsov, Vikhlinin & Nagai 2006
showed that similarly robust observables can be constructed
from X-ray data, as well). An additional virtue of these two
quantities is that they are relatively easy to measure, i.e.
they do not require a detailed measurement of radial
profiles^6. The main improvements of the present study over the
analysis of VHS02 are the following: (i) We include a full
set of 8 cosmological parameters, representing the matter
density $\Omega_m$, the dark energy density $\Omega_{\rm DE}$, the baryon density
$\Omega_b$, the Hubble constant $h$ ($h \equiv H_0/100$ km s$^{-1}$ Mpc$^{-1}$),
two dark energy equation of state parameters $w_0$ and $w_a$
(see detailed definition in next section), the normalization
of the matter density power spectrum $\sigma_8$ and the "tilt" of
the primordial power spectrum $n_s$. Note that we do not
assume spatial flatness, so the 8 parameters are independent.
VHS02 only included $\Omega_m$, $\sigma_8$ and $h$ as free parameters.
(ii) VHS02 adopted a simple spherical toy model for
cluster structure, based on the virial theorem, to predict
relations between different observable quantities. This approach
has the virtue of simplicity, and makes it easier to
interpret the results; however, such a simple cluster model
is already in contradiction with existing data. Here we use
a more elaborate, and more realistic phenomenological cluster
model, with many free parameters. We explicitly require
the model to satisfy existing observational constraints and
we explore the impact of various cluster structure uncertainties
on the final conclusions. (iii) We employ a Fisher
matrix technique, instead of a Kolmogorov-Smirnov test as
in VHS02. The Fisher matrix technique is a fast way of estimating
joint parameter uncertainties in a multi-dimensional
parameter space, and allows us to understand parameter
degeneracies. (iv) Finally, we also study constraints from the
number counts (including the effects of cluster structure
uncertainty, mass-observable scatter and incompleteness), and
we forecast the combined constraints from the scaling relations
and the number counts.

^5 See http://www.mpe.mpg.de/erosita/MDD-6.pdf and
http://ixo.gsfc.nasa.gov, respectively.

^6 We note, however, that the cluster core needs to be excised in
cooling core clusters, in order not to affect the emission-weighted
ICM temperature measurement. While it is possible to extract
both the core temperature and the ICM temperature from a single
spectrum, this will inevitably introduce uncertainties, which will
be discussed in § 4 below.

The rest of this paper is organized as follows. In § 2 we
describe the Fisher matrix technique and the physically motivated,
phenomenological cluster model we adopt. The cluster
model is compared against observations and simulations.
We also explain our choice of fiducial values for cosmological
parameters, cluster parameters and survey parameters.
In § 3 we present our main results, i.e. the constraints on
cosmological and cluster structure parameters. Proceeding
pedagogically, we first include only the 8 cosmological parameters,
then add increasing uncertainties from the cluster
structure parameters to our analysis. We also explain in
detail where the cosmological constraints from the scaling
relations come from. In § 4 we compare the scaling relation
technique with constraints from the number counts, and discuss
various caveats and possible improvements to our results.
We summarize our results and offer our conclusions in
§ 5.



2 CLUSTER MODEL AND FISHER MATRIX TECHNIQUE

2.1 Fisher matrix technique

We employ the Fisher matrix technique to forecast cosmological
constraints from future surveys. The Fisher matrix
is a quick way to estimate joint parameter uncertainties in
a multi-parameter fit (Fisher 1935; Tegmark et al. 1997). It
is defined as

$$F_{ij} = -\left\langle \frac{\partial^2 \ln \mathcal{L}}{\partial p_i\, \partial p_j} \right\rangle, \qquad (1)$$



where $\mathcal{L}$ is the likelihood for a certain observable, and $p_i$ is
the parameter set (including both cosmological and cluster
structure parameters in our case). The best attainable covariance
matrix $C$ is simply the inverse of the Fisher matrix $F$,

$$C = F^{-1}, \qquad (2)$$



and the constraint on any individual parameter $p_i$, marginalized
over all other parameters, is $\sqrt{(F^{-1})_{ii}}$. Another advantage
of the Fisher matrix technique is that it is easy to obtain
joint constraints from several data sets or methods: the total 
Fisher matrix is just the sum of individual Fisher matrices 
as long as they are uncorrelated. In this paper, we assume 
that the different Fisher matrices are indeed independent; 
we justify this assumption in § 4.

The Fisher matrix approach makes the underlying assumption
that the likelihood surface for the parameters is a
multi-variate Gaussian. This is indeed the case if experimental
errors are Gaussian-distributed and the model depends
linearly on the parameters, but in general, this assumption
does not hold, and is instead justified by invoking the central
limit theorem in the presence of a large number of independent
data. The classical example is the CMB likelihood, which is
very close to Gaussian for the so-called "normal" or "physical"
parameters (Kosowsky et al. 2002), but not necessarily
for the standard cosmological parameters. However, for most
cosmological models and future CMB data sets, especially
if combined with external data sets or a weak prior on $H_0$,
the CMB likelihood is very close to Gaussian even for the
standard cosmological parameters (Komatsu et al. 2009).
For degeneracies in parameter space that are described by
non-linear parameter combinations, the Fisher matrix approach
tends to under-estimate the error bars. Even with
these limitations, the Fisher matrix approach is invaluable
for estimating degeneracies among parameters and assessing which
data set combination can lift them.
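
The operations described in this subsection (inverting the Fisher matrix, reading off marginalized errors, and adding independent Fisher matrices) can be illustrated with a minimal Python sketch. The two 2×2 matrices and the helper names (`invert`, `marginalized_errors`, `add_fisher`) are hypothetical and chosen only for illustration; they are not values or code from this paper.

```python
import math


def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(matrix)
    # Augment the matrix with the identity.
    aug = [[float(v) for v in row] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-300:
            raise ValueError("Fisher matrix is singular (exact parameter degeneracy)")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def marginalized_errors(fisher):
    """1-sigma errors sqrt((F^-1)_ii), marginalized over all other parameters."""
    cov = invert(fisher)
    return [math.sqrt(cov[i][i]) for i in range(len(fisher))]


def unmarginalized_errors(fisher):
    """1-sigma errors 1/sqrt(F_ii), with all other parameters held fixed."""
    return [1.0 / math.sqrt(fisher[i][i]) for i in range(len(fisher))]


def add_fisher(f1, f2):
    """Joint Fisher matrix of two independent data sets."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(f1, f2)]


# Hypothetical toy matrices with opposite degeneracy directions.
fisher_a = [[4.0, 3.0], [3.0, 4.0]]
fisher_b = [[4.0, -3.0], [-3.0, 4.0]]

errors_a = marginalized_errors(fisher_a)
errors_fixed = unmarginalized_errors(fisher_a)
errors_joint = marginalized_errors(add_fisher(fisher_a, fisher_b))
```

In this toy case the marginalized error from `fisher_a` alone is larger than the unmarginalized one, and adding a second data set with a different degeneracy direction shrinks the joint error below either, which is the sense in which combining methods can lift degeneracies.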



2.2 Cluster model 

Galaxy clusters are the largest gravitationally bound struc- 
tures in the universe, and the properties of their dark matter 
halos should be relatively insensitive to astrophysical pro- 
cesses, which typically operate on scales much smaller than 
the size (i.e. virial radius) of a massive cluster. However, pro- 
cesses such as radiative cooling, star formation, heating and 
radiative feedback from active galactic nuclei, turbulence, 
and non-thermal pressure support from energetic particles 
accelerated in large-scale shocks, can all have significant im- 
pact on the thermal state and spatial distribution of gas in 
the intra-cluster medium (ICM), especially near the center 
of the cluster. Many aspects of the ICM remain poorly un- 
derstood, despite extensive theoretical work, numerical sim- 
ulations, and high-resolution observations. 

There have been many approaches to building simplified
models for cluster structure. Some are purely phenomenological
formulae for the radial profiles, such as the simple
3-parameter "beta-model" (Cavaliere & Fusco-Femiano 1976),
or the 17-parameter generalized NFW model proposed more
recently by Vikhlinin et al. (2006), which provides excellent
fits to the range of observed X-ray profiles. Many studies
have based the models on physical ingredients, generally assuming
hydrostatic equilibrium, and parameterizing the
processes listed above (see, e.g., Komatsu & Seljak 2001;
Voit et al. 2002; Ostriker et al. 2005; Reid & Spergel 2006;
Fang & Haiman 2008; Ascasibar & Diego 2008, and references
therein).

In this paper, we do not attempt to build another ab 
initio physical cluster model. Instead, we use a "hybrid" phe- 
nomenological model, with physically motivated free param- 
eters, similar to that proposed in Reid and Spergel (2006) 
and Fang and Haiman (2008). As we will show below, this 
model can satisfy most available observational constraints, 
and has the flexibility to include parameter variations. The 
ICM properties are assumed to be spherically symmetric, 
on average, and are determined by four factors: the radial 
entropy profile, the profile of the gravitational potential, the 
equations of hydrostatic equilibrium, and boundary condi- 
tions. Below, we describe how we incorporate these four fac- 
tors into our cluster model. 

Entropy profile. The radial entropy profile is parameterized
by a power law,

$$K(x) \equiv \frac{T}{\rho^{\gamma-1}} = \hat{K}\, K_{\rm vir}\, x^{s}, \qquad (3)$$

where $\rho$ and $T$ are the density and temperature of the ICM
gas, $\gamma$ is the adiabatic index, which we choose to be 5/3,
appropriate for an ideal monatomic gas, and $x$ is the dimensionless
radius, normalized by the virial radius $R_{\rm vir}$ of the
cluster. The virial radius $R_{\rm vir}$ is defined to be the radius
within which the mean density is equal to the virial density
$\rho_{\rm vir}$ determined from numerical simulations (see equation [7]
below). $\hat{K}$ is the dimensionless entropy at the virial radius,
and $s$ quantifies the logarithmic slope of the entropy with
radius. Note that convective stability requires the entropy
$K$ to be a monotonically increasing function of radius (Voit
et al. 2002), so we require $s \geq 0$. The natural choice for
the entropy scale is its value estimated using the virial theorem,
$K_{\rm vir} = T_{\rm vir}/(f_b \rho_{\rm vir})^{\gamma-1}$, where $T_{\rm vir} = G M_{\rm vir} \mu m_p/(2R_{\rm vir})$. In
the above definitions, $f_b$ is the baryon fraction of the universe
($\Omega_b/\Omega_m$), $m_p$ is the mass of a proton, and $\mu$ is the mean
molecular weight of the ICM (we adopt 0.59 as its value, appropriate
for a fully ionized H-He plasma with helium mass
fraction equal to 0.25). Note that $K_{\rm vir}$ is the characteristic
entropy of a cluster in the absence of non-gravitational forces
rather than the entropy at the virial radius; in particular, $\hat{K} > 1$
even without any feedback processes. Using the fitting formula
given by Younger & Bryan (2007) and a slightly modified
version of our current cluster model code, we estimate
that $\hat{K}$ is equal to 1.5 in the self-similar case.

Dark matter halo gravitational potential. We assume
that the dark matter halos are spherically symmetric, and
their density profiles are described by the NFW shape. The
assumption of spherical symmetry can obviously be very inaccurate
for individual clusters. We assume, however, that
the main effect of the asymmetries is to introduce a scatter
in the global scaling relations, rather than to change their
mean (the accuracy of this assumption should be assessed in
the future in three-dimensional simulations of a large sample
of clusters). The NFW profile is expressed as

$$\rho_{\rm DM}(r) = \frac{\delta_c\, \rho_c}{(r/r_s)(1 + r/r_s)^2}, \qquad (4)$$

where $r_s$ is the scale radius and $\rho_c$ is the critical density
of the universe. For a halo of DM mass $M_{\rm DM}$, the two parameters
$\delta_c$ and $r_s$ are determined from the concentration
parameter $c_{\rm NFW}$ and the virial density $\rho_{\rm vir}$,



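$$r_s = \frac{R_{\rm vir}}{c_{\rm NFW}} = \frac{1}{c_{\rm NFW}} \left( \frac{3 M_{\rm DM}}{4\pi \rho_{\rm vir,DM}} \right)^{1/3}, \qquad (5)$$

$$\delta_c = \frac{\rho_{\rm vir,DM}}{3\rho_c}\, \frac{c_{\rm NFW}^3}{\ln(1 + c_{\rm NFW}) - c_{\rm NFW}/(1 + c_{\rm NFW})}. \qquad (6)$$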



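Here $\rho_{\rm vir,DM} = (\Omega_{\rm DM}/\Omega_m)\, \rho_{\rm vir}$, with $\Omega_{\rm DM} = \Omega_m - \Omega_b$. We
adopt a fitting formula from the numerical spherical collapse
model calculation by Kuhlen et al. (2005) for $\rho_{\rm vir}$,

$$\rho_{\rm vir} = 18\pi^2\, \Omega_m(z)\, \rho_c(z) \left[ 1 + a\, \Theta^{b}(z) \right], \qquad (7)$$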

where $\Theta(z) = \Omega_m^{-1}(z) - 1$, $a = 0.432 - 2.001\,(|w(z)|^{0.234} - 1)$, and
$b = 0.929 - 0.222\,(|w(z)|^{0.727} - 1)$, with $w(z)$ the dark energy
equation of state. Note that this formula differs slightly from
other expressions for the virial overdensity that are commonly
used in the literature (e.g., Bryan & Norman 1998),
and it includes an explicit dependence on $w$. This feature is
important to us, since we will constrain $w$, and, as we will
find below, this dependence drives the constraints on $w(z)$
at low redshift.
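
As a consistency check on the NFW normalization in equations (5) and (6), the following minimal Python sketch computes $r_s$ and $\delta_c$ from a halo mass, a concentration and a virial density, and verifies that the resulting profile encloses the input mass within $R_{\rm vir}$. The function names and the numerical inputs (a $10^{14} M_\odot$ halo, $c_{\rm NFW} = 5$, a virial density of 100 times critical, $h = 0.7$) are illustrative assumptions, not the fiducial values of this paper.

```python
import math


def nfw_parameters(m_dm, c_nfw, rho_vir_dm, rho_crit):
    """Return (r_s, delta_c) following equations (5) and (6)."""
    r_vir = (3.0 * m_dm / (4.0 * math.pi * rho_vir_dm)) ** (1.0 / 3.0)
    r_s = r_vir / c_nfw
    shape = math.log(1.0 + c_nfw) - c_nfw / (1.0 + c_nfw)
    delta_c = rho_vir_dm / (3.0 * rho_crit) * c_nfw ** 3 / shape
    return r_s, delta_c


def nfw_enclosed_mass(r, r_s, delta_c, rho_crit):
    """Mass of the NFW profile of equation (4) integrated out to radius r."""
    x = r / r_s
    return (4.0 * math.pi * delta_c * rho_crit * r_s ** 3
            * (math.log(1.0 + x) - x / (1.0 + x)))


# Illustrative inputs (assumptions): masses in M_sun, lengths in Mpc.
RHO_CRIT = 2.775e11 * 0.7 ** 2   # critical density at z = 0 for h = 0.7
M_DM = 1.0e14
C_NFW = 5.0
RHO_VIR_DM = 100.0 * RHO_CRIT    # placeholder virial density of the dark matter

r_s, delta_c = nfw_parameters(M_DM, C_NFW, RHO_VIR_DM, RHO_CRIT)
mass_at_r_vir = nfw_enclosed_mass(C_NFW * r_s, r_s, delta_c, RHO_CRIT)
```

By construction, `mass_at_r_vir` reproduces `M_DM` up to rounding, which is a quick way to catch a mistyped normalization.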

It is important to note that Equation (7) was obtained
from spherical collapse calculations that assumed a constant
$w$. In particular, one may wonder whether this fitting
formula is still accurate when $w$ is redshift dependent.
To check the goodness of fit of Equation (7), we performed
numerical calculations of the virial overdensity, using the
spherical collapse model described by Kuhlen et al. We have
found that Equation (7) is accurate to within 10% for
time-varying $w$ models in the range of $-1.5 < w_0 < -0.5$ and
$-|0.5 w_0| < w_a < |0.5 w_0|$. More importantly, the virial
density computed with the numerical method is systematically
more sensitive to $w_a$ than Equation (7) predicts. For
example, we find that at redshifts $z_{\rm collapse} = 0.1$, 0.2, and 0.4,
the fractional change in $\rho_{\rm vir}(z_{\rm collapse})$, when $w_a$ is changed
from $w_a = 0$ to $w_a = 0.21$ (i.e., by its $1\sigma$ value; see below),
and all other parameters are held fixed, is a factor of 3, 2,
and 1.4 larger in the numerical calculation than predicted by
the fitting formula. The higher sensitivity is easy to understand:
$\rho_{\rm vir}$ at $z_{\rm collapse}$ depends on $w(z > z_{\rm collapse})$, which at
higher redshifts differs increasingly from the constant value
$w_0$. We therefore conclude that using the Kuhlen et al. formula
makes our constraints on $w_a$ below conservative.

Equations of hydrostatic equilibrium. Below are the
equations that we solve to obtain the cluster gas density
and pressure profiles $\rho_g(r)$, $P(r)$. The first is from the
hydrostatic equilibrium condition, the second is from mass
conservation,

$$\frac{dP(r)}{dr} = -\eta\, \rho_g(r)\, \frac{G M_{\rm tot}(<r)}{r^2}, \qquad (8)$$

$$\frac{dM_g(<r)}{dr} = 4\pi r^2 \rho_g(r), \qquad (9)$$

where $M_{\rm tot}(<r)$ is the total mass within radius $r$, including
both dark matter and baryons, and $M_g(<r)$ is the total gas
mass enclosed within radius $r$. The gas fraction $f_g$, which we
will use below, is defined to be $M_g(<R_{\rm vir})/M_{\rm tot}(<R_{\rm vir})$.
Finally, the parameter $\eta \leq 1$ is introduced to allow for deviations
from strict hydrostatic equilibrium. Physically, deviations
from $\eta = 1$ could represent any non-thermal pressure
support (e.g., from cosmic rays and/or turbulence), and also
lack of full virialization. In fact, allowing for turbulent support
in the analytical model is known to be necessary in
order to reproduce the density and temperature profiles for
the ICM gas in simulations that include non-gravitational
pre-heating (Younger & Bryan 2007).

Boundary conditions. The boundary condition at $r = 0$
is specified by requiring that $M_g(<r)$ is zero at the cluster
center. The boundary condition at the virial radius is
imposed by requiring that the gas pressure matches the momentum
flux of the infalling gas,

$$P(R_{\rm vir}) = \frac{\Omega_b}{3(\Omega_m - \Omega_b)}\, b\, \rho_{\rm DM}(R_{\rm vir})\, v_{\rm ff}^2, \qquad (10)$$

where $v_{\rm ff}$ is the free-fall velocity from the turnaround radius.
Assuming the turnaround radius is twice the virial radius, as
in the spherical collapse model, we have $v_{\rm ff}^2 = G M_{\rm vir}/R_{\rm vir}$.
We follow Reid & Spergel (2006) and introduce a free parameter
$b$ to allow for an uncertainty in this condition.
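
To make the structure of this calculation concrete, the following Python sketch integrates the hydrostatic equation (8) inward from the virial radius, starting from the boundary pressure of equation (10). It is a deliberately simplified toy version, not the solver used in this paper: it works in units $G = M_{\rm vir} = R_{\rm vir} = 1$, closes the system with $P = K(x)\rho_g^{\gamma}$ (absorbing constants into the entropy normalization `k0`), and, as a labeled assumption, ignores the gas contribution to $M_{\rm tot}$ so that the potential is a fixed NFW halo. All default parameter values and function names are hypothetical.

```python
import math

GAMMA = 5.0 / 3.0  # adiabatic index of the gas


def nfw_mu(y):
    """NFW mass shape function ln(1 + y) - y / (1 + y)."""
    return math.log(1.0 + y) - y / (1.0 + y)


def nfw_mass_fraction(x, c):
    """M(<x R_vir) / M_vir for an NFW halo of concentration c."""
    return nfw_mu(c * x) / nfw_mu(c)


def nfw_density(x, c):
    """NFW density at x = r / R_vir, in units of M_vir / R_vir^3."""
    return c ** 3 / (4.0 * math.pi * nfw_mu(c)) / (c * x * (1.0 + c * x) ** 2)


def pressure_gradient(x, pressure, k0, s, eta, c):
    """dP/dx from equation (8), with G = M_vir = R_vir = 1 and P = K(x) rho^gamma."""
    rho_gas = (pressure / (k0 * x ** s)) ** (1.0 / GAMMA)
    return -eta * rho_gas * nfw_mass_fraction(x, c) / x ** 2


def solve_pressure_profile(k0=1.0, s=1.1, eta=1.0, b=1.0, c=5.0,
                           gas_to_dm=0.2, x_min=0.05, n_steps=2000):
    """Integrate inward from the virial radius with a fixed-step RK4 scheme.

    gas_to_dm stands for Omega_b / (Omega_m - Omega_b); 0.2 is a placeholder.
    """
    x = 1.0
    # Boundary condition of equation (10), with v_ff^2 = G M_vir / R_vir = 1.
    pressure = b * gas_to_dm * nfw_density(1.0, c) / 3.0
    h = (x_min - 1.0) / n_steps  # negative step: we march toward the center
    profile = [(x, pressure)]
    for _ in range(n_steps):
        k1 = pressure_gradient(x, pressure, k0, s, eta, c)
        k2 = pressure_gradient(x + 0.5 * h, pressure + 0.5 * h * k1, k0, s, eta, c)
        k3 = pressure_gradient(x + 0.5 * h, pressure + 0.5 * h * k2, k0, s, eta, c)
        k4 = pressure_gradient(x + h, pressure + h * k3, k0, s, eta, c)
        pressure += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        x += h
        profile.append((x, pressure))
    return profile


profile = solve_pressure_profile()
profile_low_eta = solve_pressure_profile(eta=0.8)
```

In this toy setup the pressure rises monotonically toward the center, and lowering $\eta$ (more non-thermal support) yields a lower central thermal pressure for the same boundary condition. A full treatment would also integrate equation (9) and include the gas mass in $M_{\rm tot}$.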

The cluster model described above has 5 free parameters
that capture uncertainties about cluster structure and
evolution: $\hat{K}$, $s$, $c_{\rm NFW}$, $\eta$ and $b$. All of these quantities could
additionally depend on both mass and redshift ($\eta$ and $s$
could also explicitly depend on radius, which, however, we
ignore here). We use a power-law parameterization to allow
for these dependencies,

$$p = p_{\rm norm} \left( \frac{M}{M_*} \right)^{p_M} (1+z)^{p_z}, \qquad (11)$$

where $p$ could represent any of the 5 quantities. In equation [11]
each function is described by 3 parameters, one for
normalization ($p_{\rm norm}$), one for mass dependence ($p_M$) and another for
redshift dependence ($p_z$). We choose $M_*$ to be $10^{14} M_\odot$ (this choice
is not essential, since changes to $M_*$ can be compensated by
changes in the normalization).

Several cautionary remarks about the above model are 
in order. First, the assumption of power-law mass and z- 
dependence is likely valid only when the variations over the 
observed mass and redshift range are small; the real depen- 
dence could be more complicated, especially if a wide mass 
or redshift range is considered. This could mean that actual 
data will not be fit adequately by such power-laws; in this 
case, additional parameters will likely have to be introduced 
(this possibility is addressed more quantitatively below). 

Second, unlike Reid & Spergel (2006), we did not in- 
clude additional modeling of the cluster cores. The reason 
for this is that core properties are known to vary significantly 
from cluster to cluster, and it is difficult to capture this vari- 
ation with a universal parametrization. Fortunately, neglect- 
ing the core makes relatively little difference in our results. 
The two observables we focus on are temperature T and 
integrated Compton y parameter Y. We checked explicitly 
that introducing a flat entropy core within $0.1 R_{\rm vir}$ changes
the value of Y by less than 2 percent (this result is consistent 
with Reid & Spergel 2006). When computing the emission- 
weighted temperature T, we excise the innermost regions 
(see eq. [17] below), which makes our temperature-observable 
similarly robust to core properties. This cut mimics the com- 



mon procedures in existing observations, in which the am- 
bient gas temperature is inferred either by excising the core 
region, or using a model (such as a cooling flow model) 
to eliminate the contribution from core regions. In order 
to minimize potential biases from the model-dependence of 
such cuts, we use a simple definition below. 

Third, in reality, the cluster structure parameters will
clearly have cluster-to-cluster variations: each of the parameters
appearing in Equation (11) should therefore represent
only a mean value. A scatter in any cluster structure parameter
will induce a scatter in the value of the observables
we predict. Below, we will derive constraints only from the
mean observables (i.e. our signal is the mean $Y$-$T$ scaling
relation; the finite distribution of $Y$ at fixed $T$ is considered
conservatively to be pure noise). An underlying scatter in
a structure parameter can therefore have two effects on our
results. First, the measurement error of the mean $\langle Y \rangle$ is increased,
which will correspondingly weaken our statistical
constraints. Second, the mean inferred value of $\langle Y \rangle$ can be
biased, if the scatter in a parameter $p_i$ introduces a skewed
$Y$-distribution, and/or if the scatter is large and introduces
Malmquist bias (i.e. the low-$Y$ tail of the clusters at
fixed $T$ could be preferentially missing from the sample).
The first of these effects will be addressed below by allowing
for a scatter in $Y$ itself; the possible biases from the second
effect are discussed in § 3.3.

In summary, our adopted baseline cluster model has
5 × 3 parameters; given these parameters, we can numerically
solve for the density and pressure profiles, and deduce all
other ICM quantities. Below in § 2.5 we will use existing
observational data and simulation results to determine the
fiducial values of these parameters.



2.3 Fisher matrix for the scaling relation 

Our first way of constraining cosmological parameters is to
use the relation between the SZ flux $S_\nu$ (measured through
the integrated Compton $y$ parameter, $Y$) and the X-ray
emission-weighted temperature $T_{\rm ew}$. The electrons in the hot
ICM gas scatter the cosmic microwave background (CMB)
photons, which distorts the CMB spectrum by the amount
(e.g., Birkinshaw 1999; Carlstrom et al. 2002)

$$S_\nu = j_\nu Y, \qquad (12)$$

where $j_\nu$ is a known function of frequency,

$$j_\nu = \frac{2 (k_B T_{\rm CMB})^3}{(hc)^2}\, \frac{x^4 e^x}{(e^x - 1)^2} \left[ \frac{x}{\tanh(x/2)} - 4 \right]. \qquad (13)$$

Here $x = h\nu/(k_B T_{\rm CMB})$, $T_{\rm CMB}$ is the CMB temperature, $h$ is the
Planck constant, $k_B$ is the Boltzmann constant, and $c$ is the
speed of light. $j_\nu$ is positive at high frequency, negative at
low frequency, and has a null at $\nu \approx 220$ GHz. Physically,
this means low-energy photons are Compton scattered by
the hot electrons to higher energies, reducing the flux at low
frequency and increasing it at high frequency.
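
The sign change of the SZ spectral function can be verified numerically. The following Python sketch evaluates the standard non-relativistic thermal SZ spectral function, with the prefactor $2(k_B T_{\rm CMB})^3/(hc)^2$, and locates its null by bisection. The adopted CMB temperature of 2.725 K and the function names are assumptions made for this illustration; relativistic corrections are ignored.

```python
import math

H_PLANCK = 6.62607015e-34    # Planck constant [J s]
K_BOLTZMANN = 1.380649e-23   # Boltzmann constant [J / K]
C_LIGHT = 2.99792458e8       # speed of light [m / s]
T_CMB = 2.725                # CMB temperature [K] (assumed value)


def sz_spectral_function(freq_hz):
    """j_nu of equation (13), in W m^-2 Hz^-1 sr^-1 per unit Compton y."""
    x = H_PLANCK * freq_hz / (K_BOLTZMANN * T_CMB)
    prefactor = 2.0 * (K_BOLTZMANN * T_CMB) ** 3 / (H_PLANCK * C_LIGHT) ** 2
    shape = x ** 4 * math.exp(x) / math.expm1(x) ** 2
    return prefactor * shape * (x / math.tanh(0.5 * x) - 4.0)


def sz_null_frequency(lo=100.0e9, hi=400.0e9, tol=1.0e3):
    """Bisect for the frequency [Hz] at which j_nu changes sign."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if sz_spectral_function(mid) < 0.0:
            lo = mid   # still on the decrement side
        else:
            hi = mid
    return 0.5 * (lo + hi)


null_ghz = sz_null_frequency() / 1.0e9
```

Evaluating `sz_spectral_function` at the 90, 150, 220 and 270 GHz channels quoted for SPT shows a decrement in the two lowest bands, a signal close to zero near 220 GHz, and an increment at 270 GHz.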

The ICM properties are all encoded in $Y$. The total
distortion within a fixed solid angle is given by

$$Y(\leq \theta) = 2\pi \int_0^{\theta} y(\theta')\, \theta'\, d\theta', \qquad (14)$$

where $y(\theta)$ is the Compton parameter along a given line of
sight,

$$y(\theta) = \frac{\sigma_T}{m_e c^2} \int P_e\, dl = \frac{k_B \sigma_T}{m_e c^2} \int n_e T\, dl, \qquad (15)$$

and $\sigma_T$ is the Thomson cross section, $P_e = n_e k_B T$ is the electron
pressure, $n_e$ is the electron number density, and $m_e$ is the electron
mass. In this work, we
use the value of $Y$ integrated over the whole cluster, i.e. in
equation (14) we set $\theta = R_{\rm vir}/D_A$, with $D_A$ the angular
diameter distance. Combining equations (14) and (15), we
get

$$Y = \frac{k_B \sigma_T}{m_e c^2 D_A^2} \int n_e T\, dV_{\rm cluster}, \qquad (16)$$

where $V_{\rm cluster}$ is the volume of the cluster. Equation (16) clearly
shows that the integrated SZ flux directly probes the thermal
energy of the ICM.

The emission-weighted temperature is calculated as

$$T_{\rm ew} = \frac{\int_{0.15 R_{500}}^{R_{500}} \rho_g^2\, \Lambda(\nu, T)\, T\, dV}{\int_{0.15 R_{500}}^{R_{500}} \rho_g^2\, \Lambda(\nu, T)\, dV}, \qquad (17)$$

where $\Lambda(\nu, T)$ is the cooling function, calculated by a
Raymond-Smith code (Raymond & Smith 1977) with metallicity
$Z = 0.3 Z_\odot$, and $R_{500}$ is the radius with a mean enclosed
overdensity of 500 relative to the critical density. Note
that we do not integrate over the whole cluster - instead,
we excise the inner region. This mimics the temperature
measurements in X-ray observations, which either excise the
cooling flow regions, or model and subtract their contribution.
Since the inner radius ($= 0.15 R_{500}$) has to be estimated
from the data itself, this can introduce uncertainties or biases
in the inferred $T_{\rm ew}$. As we show in § 4 below, in order
not to degrade the $1\sigma$ constraints we obtain below, the inner
radius has to be accurate (statistically) to within ≈ 5%, or
the mass of the cluster to within ~ 15%.
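
The core-excised, emission-weighted average can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the calculation used in this paper: the cooling function is replaced by a pure bremsstrahlung scaling $\Lambda \propto T^{1/2}$ instead of a Raymond-Smith model, the outer integration limit is taken to be $R_{500}$, and the density and temperature profiles in the example are hypothetical toy functions.

```python
import math


def bremsstrahlung_cooling(temperature):
    """Stand-in cooling function, Lambda proportional to sqrt(T) (assumption)."""
    return math.sqrt(temperature)


def emission_weighted_temperature(density, temperature, r_500,
                                  inner_fraction=0.15, n_shells=2000,
                                  cooling=bremsstrahlung_cooling):
    """Emission-weighted T between inner_fraction * R_500 and R_500.

    density and temperature are callables of radius; the volume integral is
    done over thin spherical shells with the midpoint rule.
    """
    r_in = inner_fraction * r_500
    dr = (r_500 - r_in) / n_shells
    numerator = 0.0
    denominator = 0.0
    for i in range(n_shells):
        r = r_in + (i + 0.5) * dr
        t = temperature(r)
        weight = density(r) ** 2 * cooling(t) * 4.0 * math.pi * r ** 2 * dr
        numerator += weight * t
        denominator += weight
    return numerator / denominator


# Hypothetical toy profiles, with radii in units of R_500 and T in keV.
def toy_density(r):
    return 1.0 / (1.0 + (r / 0.1) ** 2)


def toy_temperature(r):
    return 6.0 * (1.0 - 0.3 * r)


t_ew_isothermal = emission_weighted_temperature(toy_density, lambda r: 5.0, 1.0)
t_ew_declining = emission_weighted_temperature(toy_density, toy_temperature, 1.0)
```

For an isothermal cluster the weighting drops out and the input temperature is recovered; for a declining profile the result is pulled toward the dense inner shells, which is why the choice of the excision radius matters.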

Given a $Y(T)$ relation, we can construct the scaling-relation Fisher matrix for an individual cluster as

$$F_{ij}^{\rm sr,single} = \frac{1}{\sigma_{Y,T}^2}\,\frac{\partial Y}{\partial p_i}\,\frac{\partial Y}{\partial p_j}, \tag{18}$$

where $p_i$ and $p_j$ are parameters to be constrained, and $\sigma_{Y,T}$ is the total statistical uncertainty on the value of $Y$, including both the intrinsic scatter $\sigma_i$ in $Y$ at fixed $T$ and the measurement uncertainty $\sigma_m$ in $Y$: $\sigma_{Y,T}^2 = \sigma_i^2 + \sigma_m^2$. Note that the partial derivative is taken at a fixed temperature, not at a given cluster mass, since we are studying the relation of $Y$ vs. $T_{\rm ew}$, not $Y$ vs. $M$.
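The structure of equation (18) can be illustrated with a minimal Python sketch. The function names and the numerical derivatives below are hypothetical placeholders chosen for illustration; they are not values from this work. The sketch also shows that a single-cluster Fisher matrix has rank one, so constraints on several parameters only arise once many clusters are summed.

```python
def single_cluster_fisher(dY_dp, sigma_int, sigma_meas):
    """Return F_ij = (dY/dp_i)(dY/dp_j) / sigma_YT^2 for one cluster.

    dY_dp      : derivatives of Y with respect to each parameter, at fixed T
    sigma_int  : intrinsic scatter in Y at fixed T
    sigma_meas : measurement uncertainty in Y
    """
    var = sigma_int**2 + sigma_meas**2  # sigma_YT^2, added in quadrature
    n = len(dY_dp)
    return [[dY_dp[i] * dY_dp[j] / var for j in range(n)] for i in range(n)]


def add_fisher(F_a, F_b):
    """Fisher matrices of independent clusters add element by element."""
    return [[a + b for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(F_a, F_b)]


# Two hypothetical clusters with made-up derivatives (illustration only).
F1 = single_cluster_fisher([2.0, -1.0], sigma_int=0.3, sigma_meas=0.4)
F2 = single_cluster_fisher([1.0, 3.0], sigma_int=0.3, sigma_meas=0.4)
F_total = add_fisher(F1, F2)

det_single = F1[0][0] * F1[1][1] - F1[0][1] * F1[1][0]  # vanishes: rank one
det_total = (F_total[0][0] * F_total[1][1]
             - F_total[0][1] * F_total[1][0])           # positive: invertible
```

Because each single-cluster matrix is an outer product, it cannot be inverted on its own; the sum over clusters with different derivative vectors is what makes the marginalized errors finite.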

For a sample of clusters for which $Y$ and $T$ are both measured, the total Fisher matrix is the sum of the individual single-cluster Fisher matrices. We approximate this sum by an integration,

$$F_{ij}^{\rm sr,total} = \Delta\Omega \int_{z_{\rm min}}^{z_{\rm max}} dz\,\frac{d^2V}{dz\,d\Omega}\int_{M_{\rm min}(z)}^{\infty} dM\,\frac{dn}{dM}(M,z)\,F_{ij}^{\rm sr,single}, \tag{19}$$

where $\Delta\Omega$, $z_{\rm min}$ and $z_{\rm max}$ are the solid angle, and the minimum and maximum redshifts covered by the survey, $M_{\rm min}(z)$ is the mass of the smallest detectable cluster at each redshift, $d^2V/dz\,d\Omega$ is the comoving volume element, and $dn/dM(M,z)$ is the halo mass function. The form of this Fisher matrix is similar to the "follow-up" Fisher matrix used in Majumdar & Mohr (2003), except that their observable is the cluster mass itself (or a mass-like quantity), whereas our observable here is $Y$. We used the fitting formula by Jenkins et al. (2001) for the mass function (their smoothed mass function, equation 9). In the fitting formula in Jenkins et al. (2001), the cluster mass $M$ is defined to be the mass enclosed within a spherical region with overdensity of 180 with respect to the mean background matter density, whereas we defined clusters based on their virial overdensity with respect to the critical density (eq. [7] above). We used the NFW profile to convert the Jenkins et al. mass function to be consistent with our mass definition.



2.4 Fisher matrix for number counts 

Another way of constraining cosmological parameters is 
through the cluster abundance. The observable in this case 
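is the number of clusters in a given range of redshift and $Y$,

$$N_{\alpha\beta} = \Delta\Omega\,\Delta z\,\frac{d^2V}{dz\,d\Omega}(z_\beta)\int_{M_{\rm min}}^{\infty} g(Y_\alpha, M)\,\frac{dn(M, z_\beta)}{dM}\,dM, \tag{20}$$

where $M_{\rm min}$ is a minimum mass we impose by hand (representing a sharp survey selection threshold; see the discussion in § 3.3 below for allowing uncertainties in the selection), and $g(Y_\alpha, M)$ is the probability that a cluster with mass $M$ has a value of $Y$ within the range of the $\alpha$th $Y$-bin. In this paper, for simplicity, we assume Gaussian scatter between $M$ and $Y$, so that $g(Y_\alpha, M)$ has an analytical form. Suppose bin $Y_\alpha$ is specified by its minimum $Y_{\rm min}$ and maximum $Y_{\rm max}$, and for a given mass $M$, $Y$ has a mean $\bar{Y}(M)$ and r.m.s. of $\sigma_{Y,M}$. In this case,

$$g(Y_\alpha, M) = \int_{Y_{\rm min}}^{Y_{\rm max}} \frac{1}{\sqrt{2\pi}\,\sigma_{Y,M}} \exp\left\{-\frac{[Y - \bar{Y}(M)]^2}{2\sigma_{Y,M}^2}\right\} dY = \frac{1}{2}\left[{\rm erf}\left(\frac{Y_{\rm max} - \bar{Y}(M)}{\sqrt{2}\,\sigma_{Y,M}}\right) - {\rm erf}\left(\frac{Y_{\rm min} - \bar{Y}(M)}{\sqrt{2}\,\sigma_{Y,M}}\right)\right]. \tag{21}$$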
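The bin probability for Gaussian scatter can be evaluated directly with the error function. The following is a minimal sketch using only the Python standard library; the function name `bin_probability` and the example numbers are ours and purely illustrative.

```python
import math


def bin_probability(y_min, y_max, y_mean, sigma):
    """Probability that a Gaussian-distributed Y lands in [y_min, y_max].

    y_mean is the mean Y at the given mass and sigma its r.m.s. scatter.
    An open-ended bin can be passed as y_max = math.inf.
    """
    upper = (y_max - y_mean) / (math.sqrt(2.0) * sigma)
    lower = (y_min - y_mean) / (math.sqrt(2.0) * sigma)
    return 0.5 * (math.erf(upper) - math.erf(lower))


# Example: 10% scatter around a mean of 1 (arbitrary units).
p_one_sigma = bin_probability(0.9, 1.1, 1.0, 0.1)      # within +/- 1 sigma
p_upper_half = bin_probability(1.0, math.inf, 1.0, 0.1)  # open-ended bin
```

Summing the probabilities over a set of contiguous bins that covers the whole real line returns unity, which is a convenient sanity check on any binning scheme.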



Assuming Poisson errors dominate in the number counts ($\sigma_N = \sqrt{N}$), and summing over all redshift- and $Y$-bins, the total Fisher matrix for the cluster abundance (Holder et al. 2001) is given by

$$F_{ij}^{\rm nc,total} = \sum_{\alpha=1}^{N_Y}\sum_{\beta=1}^{N_z} \frac{1}{N_{\alpha\beta}}\,\frac{\partial N_{\alpha\beta}}{\partial p_i}\,\frac{\partial N_{\alpha\beta}}{\partial p_j}, \tag{22}$$

where $N_z$ and $N_Y$ are the number of redshift bins and $Y$ bins, respectively. This expression ignores sample variance, whose effect on the cluster abundance constraints has been considered in detail in previous works (Hu & Kravtsov 2003; Lima & Hu 2004; Fang & Haiman 2006), and has been found to be modest (especially if the survey is sub-divided into many angular cells, and the variance is considered as signal, rather than noise; Lima & Hu 2004). Likewise, Holder et al. (2001) explored the validity of the Fisher matrix approach for forecasting cluster count constraints, and found it to be a good approximation, with the exception of the constraints for $\sigma_8$.
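The Poisson sum over bins can be sketched in a few lines of Python. The bin counts and derivatives below are invented for illustration, and the flat list of bins stands in for the double sum over redshift and $Y$ bins; in a real forecast the derivatives would come from finite differences of the predicted counts.

```python
def number_count_fisher(counts, dcounts):
    """Poisson Fisher matrix: sum over bins of (dN/dp_i)(dN/dp_j) / N.

    counts  : fiducial number of clusters in each bin
    dcounts : for each bin, the list of derivatives dN/dp_i
    """
    n_par = len(dcounts[0])
    F = [[0.0] * n_par for _ in range(n_par)]
    for N, dN in zip(counts, dcounts):
        if N <= 0.0:
            continue  # empty bins are skipped in this sketch
        for i in range(n_par):
            for j in range(n_par):
                F[i][j] += dN[i] * dN[j] / N
    return F


# Two hypothetical bins and two hypothetical parameters.
counts = [100.0, 50.0]
dcounts = [[10.0, 0.0], [5.0, 5.0]]
F_nc = number_count_fisher(counts, dcounts)
```

Skipping empty bins is a simplification made here to avoid division by zero; it is not a statement about how such bins are treated in the forecasts of this paper.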



2.5 Fiducial parameter values 

Here we summarize all the parameters in this work, and explain our choice of their fiducial values. Overall, the model parameters can be grouped into three categories: cosmological parameters, cluster model parameters and survey parameters.

Cosmological parameters. We include the following 8 standard cosmological parameters, with the 3-year results from the Wilkinson Microwave Anisotropy Probe (WMAP) experiment as their fiducial values (Spergel et al. 2007): $\Omega_m = 0.244$, $\Omega_{\rm DE} = 0.756$, $\Omega_b = 0.0413$, $h = 0.72$, $w_0 = -1$, $w_a = 0$, $\sigma_8 = 0.76$ and $n_s = 0.96$. Here $w_0$ and $w_a$ parametrize the dark energy equation of state,

$$w(z) = w_0 + w_a(1 - a) = w_0 + w_a\,\frac{z}{1+z}. \tag{23}$$

We do not assume a spatially flat universe, so the 8 cosmological parameters are independent. The more recent 5-year results from WMAP are consistent with the 3-year results, although $\sigma_8$ is slightly higher ($\sigma_8 = 0.817$ from the combination of the WMAP result with baryon acoustic oscillations and supernovae; Dunkley et al. 2008). Adopting this new value would increase the number of detectable clusters, and tighten our constraints below. We emphasize that the number counts constrain all 8 parameters directly, while the scaling relation alone can only constrain 6 of them ($\sigma_8$ and $n_s$ do not affect the scaling relation). Nevertheless, when combined with the number counts, the information from the scaling relation can indirectly help constrain $\sigma_8$ and $n_s$, by breaking degeneracies, as we will demonstrate in § 3 below.
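The two-parameter dark energy equation of state used above is simple to evaluate. The short sketch below writes it both in terms of the scale factor $a$ and the redshift $z$; the function names are ours, and the non-fiducial value $w_a = 0.5$ is chosen only to make the redshift dependence visible.

```python
def w_of_a(a, w0=-1.0, wa=0.0):
    """Equation of state as a function of scale factor: w = w0 + wa (1 - a)."""
    return w0 + wa * (1.0 - a)


def w_of_z(z, w0=-1.0, wa=0.0):
    """Same parametrization written in redshift, using a = 1 / (1 + z)."""
    return w0 + wa * z / (1.0 + z)


w_today = w_of_z(0.0, w0=-1.0, wa=0.5)   # equals w0 at z = 0
w_at_z1 = w_of_z(1.0, w0=-1.0, wa=0.5)   # w0 + wa / 2 at z = 1
```

With the fiducial choice $w_a = 0$ the function returns $w_0$ at every redshift, while for $w_a \neq 0$ it tends to $w_0 + w_a$ at high redshift.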

Cluster model parameters. The cluster model described in § 2.2 above has 5 × 3 = 15 parameters, describing the normalization $K$ and logarithmic slope $s$ of the gas entropy profile, the concentration parameter $c_{\rm NFW}$ for the dark matter halo profile, any contributions from non-thermal pressure $\eta$, and the gas pressure at the virial radius $b$, each with a normalization, redshift dependence and mass dependence. We set their fiducial values using results from simulations and observations.

The self-similar collapse model that invokes only gravity and shock heating predicts a universal entropy profile $K(r) \propto r^{1.1}$ (Tozzi & Norman 2001; Borgani et al. 2001), which is in general agreement with observations outside the core (Ponman et al. 2003; Pratt et al. 2006). We therefore adopt this fiducial value for $s$, with no dependence on mass or redshift ($s_m = s_z = 0$, $s_{\rm norm} = 1.1$).

The difference in the cluster mass inferred from weak lensing and X-ray measurements suggests that non-thermal pressure contributes about 10% to the total gas pressure (Zhang et al. 2008); similar amounts have also been seen in simulations (Rasia et al. 2004; Kay et al. 2004; Faltenbacher et al. 2005). We thus adopt $\eta = 0.9$ with no mass and redshift dependence.

For the concentration parameter $c_{\rm NFW}$, we adopt the fiducial value that is directly computed from cosmological simulations in Voit et al. (2003),

$$c_{\rm NFW}(M, z) = 8.5\left(\frac{M}{10^{15}\,h^{-1}M_\odot}\right)^{\ldots}(1+z)^{\ldots}. \tag{24}$$

Note that unlike our other model parameters, in principle, $c_{\rm NFW}$ can be accurately computed ab initio, using three-dimensional N-body or hydro simulations. However, the uncertainties are still significant, and the current results are in tension with X-ray observations (e.g., Duffy et al. 2008); for completeness, we therefore include it as a free parameter in our baseline model. Below, we will investigate the benefits of placing tight priors on this parameter (and we find the benefits to be small). For simplicity, we set $b = 1$, which corresponds to the condition that all kinetic energy is transformed into thermal energy at the virial radius. Molnar et al. (2008) recently studied in detail the morphology and properties of virial shocks around galaxy clusters in smooth particle hydrodynamics (SPH) and adaptive mesh refinement (AMR) simulations of a sample of individual clusters. Although virial shocks are often preceded by external shocks farther out, closer to $2$-$3\,R_{\rm vir}$, a significant fraction of the clusters' surface area is covered by strong virial shocks located at $\sim R_{\rm vir}$, for which $b = 1$ should be a good approximation. While detailed simulations could help refine the best fiducial choice for the mean boundary pressure, we do not expect the choice of this fiducial value to have a significant impact on our forecasts.

Figure 1. The $L$-$T$ relation observed at low redshift, together with the predictions in our cluster model. The data points are from the HIFLUGCS cluster sample, with average redshift $z = 0.05$ (Reiprich & Böhringer 2002). The thick solid [red] curve shows the $L$-$T$ relation at this redshift, predicted in our fiducial cluster model with $K_{\rm norm} = 2.4$, $K_m = -0.12$ and $K_z = 0.0$. For comparison, we also show the $L$-$T$ curve for $K_{\rm norm} = 1$ (dashed [blue] curve; $K_m$ and $K_z$ are kept at their fiducial values) and for $K_m = 0.0$ (dotted [thin red] curve; $K_{\rm norm}$ and $K_z$ are kept at their fiducial values). Decreasing $K_{\rm norm}$ increases the luminosity at a given temperature, whereas changing $K_m$ changes the slope.

The fiducial value of $K$ is fixed by fitting the $L$-$T$ relation. In Figure 1 we compare the observed $L$-$T$ relation to the predicted values for the best-fit $K$. The data points are from the HIFLUGCS cluster sample (Reiprich & Böhringer 2002). The bolometric luminosity is computed from

$$L = \int dV \int d\nu\; n_e(r)\,n_{\rm H}(r)\,\Lambda(T, \nu), \tag{25}$$

where $n_e$ and $n_{\rm H}$ are electron and proton number densities, respectively, and we assume a helium mass fraction of 25%. Unlike $Y$, $L$ is sensitive to the core properties, and the HIFLUGCS sample includes both cooling-core and non-cooling-core clusters. To account for this mixing, we modify the entropy profile in our model clusters, and add a flat entropy core of size $0.1\,R_{\rm vir}$. The best-fit value of $K$ is found, by minimizing $\chi^2$ using the data from Reiprich & Böhringer, to be $2.4\,(M/M_*)^{-0.12}(1+z)^{0}$.

This is in accordance with previous results on cluster formation and evolution. In particular, we find that the entropy is elevated ($K_{\rm norm} > 1.5$), as expected from feedback processes, and also that the increase in entropy is more significant for lower-mass clusters, breaking the self-similar relation. For comparison, in the same figure we also plot the $L$-$T$ curves expected for $K_{\rm norm} = 1.0$ (dashed line; $K_m$ and $K_z$ are kept at their fiducial values) and $K_m = 0$ (dotted line; $K_{\rm norm}$ and $K_z$ are kept at their fiducial values). Lowering $K_{\rm norm}$ raises $L$ at a given temperature, while lowering $K_m$ increases the slope.

Figure 2. The $L$-$T$ relation observed at high redshift ($z \approx 0.8$), and predictions in our cluster model. The data points are from the high-redshift WARPS clusters with average redshift $\langle z\rangle = 0.8$. The thick solid [red] curve is the prediction in our fiducial cluster model with $K_{\rm norm} = 2.4$, $K_m = -0.12$ and $K_z = 0.0$.

We emphasize that $K = 1$, as mentioned above, does not correspond to the gravitational-heating only case, and that, in agreement with previous results, the most massive clusters in Figure 1 fall on the observed $L$-$T$ relation even without any non-gravitational heating. To verify this, we followed Fang & Haiman (2008), and used the fitting formula given in Younger & Bryan (2007) to compute the maximum entropy in the gravitational-heating only case. This entropy is found to be $\approx 1.5\,K_{\rm vir}$ at the virial radius. With our adopted set of cosmological parameters, a 10 keV cluster has a mass of $3 \times 10^{15}\,M_\odot$, and $K = 1.6$; the difference between our fiducial $K$ and that of the simulation is only 0.1 around $T = 10$ keV (this difference is due to the fact that our fiducial entropy profile is steeper than found in adiabatic simulations).

Though a successful fit in general, at the lowest temperatures shown in Figure 2 ($T \lesssim 1$ keV), the best-fit $L$-$T$ relation has a slope that flattens slightly, in contrast with observations that indicate a steepening at these temperatures (e.g., Helsdon & Ponman 2000). This shows that our particular power-law parameterization (eq. 11) is insufficient in capturing the increase in the mean entropy at low temperatures. Correcting this deficiency would be possible by using different parameterizations (e.g., parameterizations that preferentially affect the cores of low-temperature clusters, such as the "entropy-floor" models; e.g., Fang & Haiman 2008). However, in this paper, we focus on clusters detectable in SZ experiments. These have a mass $\gtrsim 10^{14}\,h^{-1}M_\odot$ (or equivalently, a mean temperature of $T \gtrsim 1.5$ keV), and the power-law models provide a good fit to the $L$-$T$ relation of these clusters (Fig. 2).

We also note that we do not find a need for $K$, as expressed in the form of equation (11), to evolve with redshift. In Figure 2 we show the $L$-$T$ relation predicted in our model, assuming $K_z = 0$, at redshift $z = 0.8$, together with data from the high-redshift cluster sample from the Wide Angle ROSAT Pointed Survey (WARPS; with average redshift of $\langle z\rangle = 0.8$). As the figure shows, the model provides an excellent match to the data. Although not immediately obvious, this conclusion is qualitatively in agreement with the result in Fang & Haiman (2008), who found that if a fixed entropy floor is assumed to exist in all clusters at a given redshift, then this floor value has to decrease toward higher redshifts. The entropy floor is the difference between the total entropy and the baseline value (without non-gravitational heating). Our result indicates that the ratio of total entropy to baseline entropy is the same for low and high redshifts. This means that the difference is smaller for high-redshift clusters, since they have a higher density and a smaller baseline entropy. (For more details on this apparent coincidence, see Figure 6 and the related discussion in Fang & Haiman 2008).

Our parameterization allows us to vary the slope of the entropy profile $s$, which is necessary to fit temperature profiles measured in X-ray observations. This, in fact, is the main advantage of our parameterization compared to similar models that include a constant entropy floor, since the latter approach generically fails to match radial profiles (e.g., Younger & Bryan 2007 and references therein). In Figure 3 we show the temperature profile in our fiducial cluster model with $s = 1.1$, compared to the mean profile recently inferred from XMM-Newton observations (Leccardi & Molendi 2008). In this plot, we adopt $z = 0.2$, which is approximately the median redshift of the XMM-Newton cluster sample, and $M = 10^{15}\,M_\odot$, which corresponds to a mean temperature of about 6 keV, approximately the temperature of a typical cluster in the sample. We follow Leccardi & Molendi (2008) to compute $R_{180}$ as

$$R_{180} = 1780\left(\frac{T}{5\,{\rm keV}}\right)^{1/2} h(z)^{-1}\ {\rm kpc}. \tag{26}$$

As shown in Figure 3, the model temperature profile is in good agreement with observations out to the radius of $0.6\,R_{180}$ where data is available.

Finally, we compare the $Y$-$M$ relation in our fiducial model with predictions from simulations. More specifically, in Figure 4 we show the $Y_{200}D_A^2$-$M_{200}$ relation in our fiducial model, together with the predictions for the same quantity in Nagai (2006) and Sehgal et al. (2007), who respectively use high-resolution hydrodynamical simulations, and N-body simulations of dark matter halos and a prescription for the corresponding gas distribution. Here $Y_{200}$ and $M_{200}$ are the integrated SZ Compton $y$ parameter and the total mass within the radius $R_{200}$. We note that Sehgal et al. integrate over a cylindrical region extending to an angular radius corresponding to $R_{200}$, while Nagai integrates over a sphere of radius $R_{200}$. For a fair comparison, we compute $Y_{200}$ both ways. The upper solid [red] curve in Figure 4 is $Y_{200}$ in a cylindrical region, while the lower solid [red] curve is that in a sphere. The redshift is set to be $z = 0$, to match Nagai's simulation. We use the fitting parameters in row 2 of Table 2 in Sehgal et al. (2007), since we focus on clusters above the SZ detection threshold (about $3 \times 10^{14}\,M_\odot$ at intermediate redshifts). Sehgal et al. find a slightly steeper slope than Nagai, which they attribute to the effect of AGN heating. The slope of Nagai roughly agrees with the self-similar expectation. Our slope is closer to that of Sehgal et al., which includes star formation and AGN feedback, but no cooling. Overall, however, the slopes are close to one another, and our normalization falls between the predictions of the two simulations.

Figure 3. Radial temperature profile in our fiducial phenomenological model, in which the entropy profile has a logarithmic slope of $s = 1.1$, compared to recent XMM-Newton measurements of Leccardi & Molendi (2008). We adopt a redshift of $z = 0.2$, approximately the median redshift of the observed cluster sample, and a mass of $M = 10^{15}\,M_\odot$, which corresponds to a temperature of approximately 6 keV, close to the temperature of a typical cluster in the sample.

Figure 4. Scaling relation of $Y_{200}D_A^2$-$M_{200}$ of our phenomenological model (solid line) and two simulations: Sehgal et al. (2007, dashed line) and Nagai (2006, dotted line). The upper solid line is $Y_{200}$ over a cylindrical region, computed in the same way as in Sehgal et al. (2007), while the lower line is that over a sphere, computed in the same way as in Nagai (2006). The redshift is set to be $z = 0$ here.

In addition to the 15 parameters fixed above, we also allow for independent scatter in the $Y$-$T$ and the $Y$-$M$ relations. We chose a fiducial value of 10% for both (i.e. $\sigma_i$ in eq. 18 and $\sigma_{Y,M}$ in eq. 21 are both $0.1Y$), motivated by the simulations of Nagai (2006), who finds an r.m.s. scatter between $Y$ and $M$ of 10-15%. The effect of a non-zero $Y$-$M$ scatter on our results is two-fold. Scatter increases the number of detected clusters at a given flux threshold (because of the steep slope of the mass function, more clusters scatter from below the threshold to above it than vice versa), which is helpful in constraining cosmology. On the other hand, scatter flattens the effective mass function, which degrades the information derivable from the shape of the mass function (as will be demonstrated in § 3 below, we find that the second effect dominates in the constraints from the cluster counts).

In summary, we conclude that our fiducial cluster struc- 
ture model matches existing observations and simulations 
reasonably well, at least at low redshifts, and for the clusters 
above the expected detection threshold of future SZ surveys. 
This gives us confidence to use our adopted model parame- 
terization to forecast cosmological constraints, and to study 
the effect of cluster structure uncertainties. Of course, in 
the future, as better data becomes available, it is possible 
(indeed likely) that modifying the parameterization of the 
cluster structure model will become necessary. 

Survey parameters. Survey parameters include the following: sky coverage $\Delta\Omega$, frequency $f$, measurement noise $\sigma_m$, redshift range $z_{\rm min}$-$z_{\rm max}$ and related parameters, such as redshift bin size and $Y$ bin size. We adopt typical values relevant to upcoming SZ surveys for these parameters.

The sky coverage $\Delta\Omega$ is set to be 4,000 deg$^2$, which is the solid angle covered by SPT in 2 years. The frequency $f$ is chosen to be 145 GHz, where the SZ signal reaches maximum decrement. Most surveys will observe in bands at multiple frequencies, in order to separate the SZ signal from other CMB secondary anisotropies and from foregrounds. However, almost all planned surveys have a frequency band around 145 GHz. The detector noise $\sigma_m$ is set to be 1 mJy, which represents a typical value for the total SZ flux of the smallest detectable cluster in upcoming surveys (such as in SPT). At a frequency of 145 GHz, this corresponds to $\sigma_m = 9.36 \times 10^{-13}$ sr. We set the cluster detection threshold to be $5\sigma_m$. At low redshift, surface brightness becomes an important additional detection criterion, and a simple flux limit becomes inadequate, so we impose a floor on the mass limit, $M_{\rm min} = 10^{14}\,h^{-1}M_\odot$ (eq. 20). We discuss how this choice affects our results below.
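The conversion between an SZ flux in mJy and an integrated Compton parameter in steradians can be checked with the standard non-relativistic thermal SZ spectrum, $|\Delta S| = I_0\,|g(x)|\,Y$ with $x = hf/k_BT_{\rm CMB}$. The sketch below assumes $T_{\rm CMB} = 2.725$ K and neglects relativistic corrections; neither assumption is stated in the text, and the function names are ours.

```python
import math

H_PLANCK = 6.62607015e-34   # J s
K_BOLTZ = 1.380649e-23      # J / K
C_LIGHT = 2.99792458e8      # m / s
T_CMB = 2.725               # K (assumed value)


def sz_spectral_function(freq_hz):
    """Non-relativistic thermal SZ spectral shape g(x) for intensity."""
    x = H_PLANCK * freq_hz / (K_BOLTZ * T_CMB)
    ex = math.exp(x)
    return x**4 * ex / (ex - 1.0)**2 * (x * (ex + 1.0) / (ex - 1.0) - 4.0)


def flux_mjy_to_Y(flux_mjy, freq_hz):
    """Convert an SZ flux magnitude in mJy to integrated Y in steradians."""
    i0 = 2.0 * (K_BOLTZ * T_CMB)**3 / (H_PLANCK * C_LIGHT)**2  # W m^-2 Hz^-1 sr^-1
    i0_mjy_per_sr = i0 * 1.0e26 * 1.0e3  # 1 Jy = 1e-26 W m^-2 Hz^-1
    return flux_mjy / (i0_mjy_per_sr * abs(sz_spectral_function(freq_hz)))


Y_noise = flux_mjy_to_Y(1.0, 145.0e9)       # sigma_m for 1 mJy at 145 GHz
Y_threshold = flux_mjy_to_Y(5.0, 145.0e9)   # 5 sigma_m detection threshold
```

Under these assumptions a flux of 1 mJy at 145 GHz maps to a $Y$ of order $10^{-12}$ sr, close to the value quoted above, and the negative sign of $g(x)$ at this frequency reflects that the signal is a decrement.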

We further assume the SZ survey covers the redshift range $0.0 < z < 2.0$, and we divide this range into 40 uniform bins ($\Delta z = 0.05$). This of course requires that the cluster redshift measurement uncertainty be smaller than 0.05. This accuracy should be achievable by follow-up surveys designed for this purpose; for example, the Dark Energy Survey can determine cluster photometric redshifts to an accuracy of 0.02 or better out to $z \sim 1.3$ (Abbott et al. 2005).

Finally, we divide the $Y$ range into 8 bins, which we allocate so that each bin contains a similar number of clusters. This requirement led us to adopt the following boundaries of the $Y$-bins, in units of the $5\sigma_m$ detection threshold: $[1, 2^{1/4}]$, $[2^{1/4}, 2^{1/2}]$, $[2^{1/2}, 2^{3/4}]$, $[2^{3/4}, 2^{1.0}]$, $[2^{1.0}, 2^{5/4}]$, $[2^{5/4}, 2^{3/2}]$, $[2^{3/2}, 2^{7/4}]$, $[2^{7/4}, \infty]$.
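The bin boundaries listed above can be generated programmatically. The sketch below assumes that the edges are successive powers of $2^{1/4}$ in units of the detection threshold and that the last bin is open-ended; the open upper bound and the function name are assumptions of this illustration.

```python
import math


def y_bin_edges(n_bins=8, log2_step=0.25):
    """Edges of the Y bins in units of the detection threshold.

    The first n_bins edges are 2**(k * log2_step) for k = 0 .. n_bins - 1;
    the final edge is infinity, so the last bin collects all brighter clusters.
    """
    edges = [2.0 ** (k * log2_step) for k in range(n_bins)]
    edges.append(math.inf)
    return edges


edges = y_bin_edges()
bins = list(zip(edges[:-1], edges[1:]))  # (lower, upper) pairs, one per bin
```

Logarithmic spacing is a natural choice here because the steep mass function makes the counts fall quickly with $Y$, so equal steps in $\log Y$ keep the bin populations closer to one another than equal steps in $Y$ would.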

In Figure 5 we show the minimum mass detectable by the SZ survey as a function of redshift, neglecting scatter between $Y$ and $M$. The flat portion of the curve at low redshift is the limit imposed by hand ($10^{14}\,h^{-1}M_\odot$). The total number of detected clusters is about 6,800 for our adopted set of fiducial cosmological parameters (for reference, we note that the total number would be a factor of $\sim 3$ higher, $\sim 20{,}000$, if we instead adopted the 1st year WMAP cosmological parameters; the difference is attributable mostly to the change in $\sigma_8$ from 0.90 to 0.76).

Figure 5. The mass of the smallest detectable cluster as a function of redshift (neglecting scatter), corresponding to a constant SZ flux of 5 mJy. The flat part at low redshift is an additional constant mass limit imposed by hand, $10^{14}\,h^{-1}M_\odot$.

In our fiducial calculation, we assume that temperature measurements are available for all clusters detected by the SZ survey. This is usually taken to require X-ray spectroscopic data for each cluster. Mroczkowski et al. (2008) find that a joint analysis of SZ and X-ray imaging data, for 3 clusters detected with the SZA instrument, yields temperature profiles that are in good agreement with spectroscopic X-ray measurements. This may ease the requirement on the depth of the X-ray survey. The availability of temperatures for a large fraction of the SZ clusters may, however, still be an optimistic assumption, since the X-ray flux drops much faster than the SZ flux at high redshift. We discuss the effect of partial follow-up below.



3 RESULTS 

With the cluster model and the Fisher matrix technique described above, we are now ready to forecast constraints from upcoming SZ and X-ray surveys. As an academic exercise, in § 3.1 we first consider an "idealized case", in which cluster parameters are precisely known, and only cosmological parameters are constrained. This exercise serves two purposes: (i) it allows us to understand where the cosmology sensitivity comes from, and (ii) it will clarify the amount of the degradation in the constraints, once the cluster model parameter uncertainties are included. In § 3.2 we relax the assumption that cluster structure parameters are known, and simultaneously constrain cosmological and cluster parameters. Finally, in § 3.3 we investigate the effects of the additional uncertainties in the scatter and incompleteness.



3.1 Constraints with cosmological parameters alone

Table 1 lists the marginalized $1\sigma$ errors on cosmological parameters. In this and in the other tables below, "SR" stands for "scaling relations", and "NC" stands for "number counts". In addition to the marginalized errors (computed as $\sigma = [(F^{-1})_{ii}]^{1/2}$), in Table 1 we also list the single-parameter errors $(F_{ii})^{-1/2}$, and the degeneracy parameter $\mathcal{D}$, which we define as $\sigma/(F_{ii})^{-1/2}$. In the limit of no degeneracies, $\mathcal{D} \to 1$, while large $\mathcal{D}$ indicates significant degeneracy.
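The three quantities defined above can be computed from any Fisher matrix. The sketch below does so for a two-parameter case, where the inverse is available in closed form; the matrix entries are hypothetical and chosen only to show how a strong correlation inflates the degeneracy parameter.

```python
import math


def errors_2x2(F):
    """Marginalized errors, single-parameter errors and degeneracy parameter.

    F is a symmetric, positive-definite 2x2 Fisher matrix given as nested lists.
    """
    (a, b), (c, d) = F
    det = a * d - b * c
    inv_diag = [d / det, a / det]                    # diagonal of F^-1
    marginalized = [math.sqrt(v) for v in inv_diag]  # [(F^-1)_ii]^(1/2)
    single = [1.0 / math.sqrt(a), 1.0 / math.sqrt(d)]  # (F_ii)^(-1/2)
    degeneracy = [m / s for m, s in zip(marginalized, single)]
    return marginalized, single, degeneracy


# Uncorrelated parameters: the degeneracy parameter is exactly 1.
_, _, D_diag = errors_2x2([[4.0, 0.0], [0.0, 1.0]])

# Strongly correlated parameters (correlation coefficient 0.9).
marg, single, D_corr = errors_2x2([[1.0, 0.9], [0.9, 1.0]])
```

For two parameters the degeneracy parameter reduces to $1/\sqrt{1-r^2}$, where $r = F_{12}/\sqrt{F_{11}F_{22}}$, so values of order 100 correspond to $|r|$ within about $10^{-4}$ of unity.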

As Table 1 shows, the scaling relation in general has a constraining power comparable to the number counts. For some of the cosmological parameters, and especially for $w_0$ and $w_a$, the SR is even more powerful than the NC approach. This might be surprising, since the cluster abundance is known to be exponentially sensitive to cosmological parameters, while the SR depends on these parameters more-or-less "linearly". However, the scaling relation approach has its own advantages. We compare these two approaches in more detail below. Let us first see where the constraints come from in the scaling relation approach.

It is easy to see from equation (16) above that

$$Y \propto D_A^{-2} M_g T = D_A^{-2} f_g M_{\rm vir} T. \tag{27}$$

Since we are studying the $Y$-$T$ relation, we eliminate the mass $M_{\rm vir}$ from this equation by converting $M_{\rm vir}$ to $T$ and $\rho_{\rm vir}$ using the virial theorem and mass conservation,

$$T_{\rm vir} \propto M_{\rm vir}/R_{\rm vir}, \tag{28}$$

$$M_{\rm vir} \propto \rho_{\rm vir} R_{\rm vir}^3. \tag{29}$$

Combining the above three equations, we find

$$Y \propto D_A^{-2} f_g\,\rho_{\rm vir}^{-1/2}\,T^{5/2}. \tag{30}$$
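For completeness, the intermediate algebra leading to equation (30) can be written out as follows. This is our own restatement of the steps, identifying $T$ with $T_{\rm vir}$ and dropping all constants of proportionality.

```latex
\begin{align}
R_{\rm vir} &\propto \left(\frac{M_{\rm vir}}{\rho_{\rm vir}}\right)^{1/3}
  && \text{(from eq. 29)} \\
T &\propto \frac{M_{\rm vir}}{R_{\rm vir}}
   \propto M_{\rm vir}^{2/3}\,\rho_{\rm vir}^{1/3}
  && \text{(eq. 28 with the line above)} \\
M_{\rm vir} &\propto T^{3/2}\,\rho_{\rm vir}^{-1/2}
  && \text{(inverting for the mass)} \\
Y &\propto D_A^{-2} f_g M_{\rm vir} T
   \propto D_A^{-2} f_g\,\rho_{\rm vir}^{-1/2}\,T^{5/2}
  && \text{(substituting into eq. 27)}
\end{align}
```

The $T^{5/2}$ scaling is thus the self-similar expectation at fixed $f_g$ and $\rho_{\rm vir}$; any cosmology dependence at fixed temperature has to enter through the three prefactors.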

Equation (30) indicates that the dependence on cosmological parameters can arise through three terms: the angular diameter distance $D_A$, the gas fraction $f_g$ (defined in § 2.2 above), and the virial overdensity $\rho_{\rm vir}$. Here $f_g$ and $\rho_{\rm vir}$ are both related to cluster properties; $D_A$, on the other hand, is a direct property of space-time geometry. Below, we first study the dependence of $Y$-$T$ through the grouped combination of $f_g\rho_{\rm vir}^{-1/2}$ and through $D_A^{-2}$. This grouping is useful because $D_A$ is a purely geometrical quantity, and the cosmology dependence that arises through this quantity is likely to be quite robust. On the other hand, predicting $f_g\rho_{\rm vir}^{-1/2}$ requires a structure formation model, and the cosmology dependence through this quantity will be necessarily model dependent. In particular, while $\rho_{\rm vir}$ depends only on the details of nonlinear gravitational collapse, $f_g$ also depends on gas physics, in particular, in our case, on our assumption of hydrostatic equilibrium. For each cosmological parameter, we want to know whether these two dependencies work in the same direction, or whether they tend to cancel each other, and, in either case, it is useful to know which dependence dominates the constraints.

To answer this question, we computed $d\ln Y/dp$ separately from each cosmology-dependent term, since the Fisher matrix element (eq. 18) is proportional to this derivative. First, we allow cosmological parameters to vary when we compute $D_A$, but artificially keep them at their fiducial values in the computation of $f_g$ and $\rho_{\rm vir}^{-1/2}$. The resulting $d\ln Y/dp$ quantifies the dependence through $D_A^{-2}$ alone. We next allow cosmological parameters to vary in $f_g$ and $\rho_{\rm vir}^{-1/2}$, but we keep them fixed in $D_A$; this yields the dependence through $f_g\rho_{\rm vir}^{-1/2}$ alone. The overall dependence $d\ln Y/dp$ is simply the sum of these two. We compute the above derivatives at $z = 0.2$ and $z = 1.5$, and the mass is set to $10^{[\ldots]}\,M_\odot$ (we did not find a strong mass dependence in the derivatives, so these numbers are typical for clusters in the whole mass range of interest).

Table 1. Estimated $1\sigma$ errors on the cosmological parameters in the academic case when cluster structure parameters are precisely known. Here "SR" and "NC" stand for "scaling relation" and "number counts", respectively. The degeneracy parameter D is defined as σ/(F_ii)^(-1/2), so that large values indicate significant degeneracy.

Parameter   SR                               NC                               Combined                         Complementarity†
            σ        (F_ii)^(-1/2)  D        σ        (F_ii)^(-1/2)  D        σ        (F_ii)^(-1/2)  D
Ω_m         0.055    0.00030        183.4    0.023    0.0024         9.6      0.009    0.00029        29.6     0.17
Ω_DE        0.20     0.028          7.0      0.29     0.016          18.4     0.06     0.014          4.32     0.13
Ω_b         0.012    0.00006        189.7    0.007    0.00035        20.2     0.0037   0.00006        60.4     0.36
h           0.08     0.0014         55.1     0.11     0.006          18.6     0.050    0.0014         36.4     0.62
w_0         0.037    0.010          3.6      0.20     0.018          11.1     0.016    0.009          1.8      0.20
w_a         0.21     0.044          4.7      1.4      0.10           14.4     0.11     0.040          2.7      0.29
σ_8         N/A      N/A            N/A      0.016    0.0011         14.2     0.007    0.0011         6.2      0.19
n_s         N/A      N/A            N/A      0.13     0.023          5.7      0.036    0.022          1.6      0.08

† "Complementarity" parameter which quantifies the level of degeneracy breaking when different measurements are combined. See eq. 31 and Fang & Haiman (2008) for the formal definition and for details.

The results of the above exercise are listed in Tables 2 and 3. As we can see from these two Tables, for each of the cosmological parameters, the derivatives through $D_A^{-2}$ and $f_g\rho_{\rm vir}^{-1/2}$ have different signs (except for $\Omega_b$, to which $D_A$ has no sensitivity). This, unfortunately, means that the dependence from $D_A^{-2}$ always partially cancels the dependence from $f_g\rho_{\rm vir}^{-1/2}$. For $\Omega_m$ and $\Omega_b$, the overall derivative is driven overwhelmingly by $f_g\rho_{\rm vir}^{-1/2}$, and correspondingly the constraints come through $f_g\rho_{\rm vir}^{-1/2}$. For $\Omega_{\rm DE}$ and $h$, there are significant cancellations, and the overall derivative has the same sign as that from $D_A^{-2}$, so the constraints come predominantly through $D_A^{-2}$. For $w_0$ and $w_a$, there are again significant cancellations, and the overall constraints come predominantly through $f_g\rho_{\rm vir}^{-1/2}$ at low redshift, but through $D_A^{-2}$ at high redshift.

Although each parameter has a dependence through $f_g\rho_{\rm vir}^{-1/2}$,¹ the situation is different for ($\Omega_m$, $\Omega_b$) and for ($w_0$, $w_a$). $\Omega_m$ and $\Omega_b$ are both directly related to $f_g$, while they have a smaller or no effect on $\rho_{\rm vir}$. The gas fraction $f_g$ is roughly proportional to the global baryon fraction $f_b \equiv \Omega_b/\Omega_m$. This direct dependence is the strongest among all dependencies of $Y$-$T$ on cosmological parameters. The dark energy equation-of-state parameters $w_0$ and $w_a$, on the other hand, have no direct effect on $f_g$, and they mainly come into play through $\rho_{\rm vir}$. A higher $w$ induces a higher $\rho_{\rm vir}$, because clusters collapse earlier in such a universe (Kuhlen et al. 2005). Higher density means higher temperature for a given cluster mass (eqs. 28 and 29), or conversely, lower mass for a given temperature. As a result, $Y$ is reduced for a given $T$ (eq. 27). This reduction is further enhanced by the indirect effect of $\rho_{\rm vir}$ on $f_g$ through the pressure boundary condition. This can be understood by recalling that the ICM is assumed to be confined by the external pressure of the infalling gas. This pressure is proportional to $\Omega_b T/\Omega_m$ (eqs. 10 and 28). When the virial density is raised, the external pressure increases roughly linearly with the temperature, which implies that the gas density is approximately kept constant ($P \propto \rho T$). The gas fraction, which is roughly proportional to the ratio of gas density to the virial overdensity, is therefore reduced.

¹ Although we will keep referring to the combination $f_g\rho_{\rm vir}^{-1/2}$, it is useful to clarify that for $w$, the dependence through $\rho_{\rm vir}$ is always much stronger than the mild cosmology dependence through $f_g$ arising from eq. (9). On the other hand, we find that $f_g$ is much more sensitive to $\Omega_m$ and $\Omega_b$ than $\rho_{\rm vir}$ is.

Table 2. $d\ln Y/dp$ evaluated at a fixed cluster temperature at $z = 0.2$, where $p$ is any of the 6 cosmological parameters. The three columns, from left to right, show the values when we include the dependence only via D_A^(-2), or only via f_g ρ_vir^(-1/2), or the full derivative.

Parameter   D_A^(-2)   f_g ρ_vir^(-1/2)   Overall
Ω_m         +0.13      -6.20              -6.07
Ω_DE        -0.19      +0.15              -0.03
Ω_b         --         +27.50             +27.50
h           +2.59      -1.34              +1.25
w_0         +0.19      -0.38              -0.19
w_a         +0.01      -0.05              -0.04

Table 3. Same as Table 2 except for $z = 1.5$.

Parameter   D_A^(-2)   f_g ρ_vir^(-1/2)   Overall
Ω_m         +0.11      -6.75              -5.64
Ω_DE        -0.76      +0.68              -0.08
Ω_b         --         +26.96             +26.96
h           +2.59      -1.37              +1.23
w_0         +0.45      -0.15              +0.30
w_a         +0.11      -0.07              +0.03

Among the cosmological parameters, the Y-T relation is most
sensitive to Ω_m and Ω_b, and constraints on these two parameters
are therefore the tightest. However, they also suffer the most
from the severe degeneracy between one another (see column D in
Table 1). This is because both parameters affect the Y-T relation
in an approximately uniform way, insensitive to redshift and
cluster mass. This also means that a prior on one of these two
parameters from another measurement could greatly help constrain
the other. For example, if we apply a prior of 0.0015 on Ω_b from
the WMAP+BAO+SN result (Dunkley et al. 2008), ΔΩ_m is reduced by
more than a factor of 3, while the constraints on other parameters
change only mildly. From the simple argument that both Ω_m and Ω_b
are constrained through f_g, and f_g is roughly proportional to the
baryon fraction Ω_b/Ω_m, one could conclude that it is the
combination Ω_b Ω_m^{-1} that is being constrained. This conclusion
is borne out by our numerical results, which show that the Fisher
eigenvector in the Ω_b-Ω_m subspace points along the direction
Ω_b Ω_m^{-1} = const; i.e., the scaling relation indeed best
constrains this combination. Other parameters have a comparatively
smaller effect on the Y-T relation (see the "Overall" column in
Tables 2 and 3 and the corresponding column (F_ii)^{-1/2} in
Table 1). But because the sensitivity to each parameter has a
different redshift dependence, they suffer much weaker
degeneracies. The parameter with the lowest degeneracy (smallest D
parameter) is w_0, with D = 3.6, less than 1/50th of the degeneracy
between Ω_m and Ω_b.
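
The relation between the single-parameter error (F_ii)^{-1/2}, the marginalized error σ, the degeneracy parameter D, and the effect of an external prior can be sketched for a toy two-parameter Fisher matrix. This is only an illustration: the matrix entries and the helper names below are hypothetical and are not taken from this work, and D is taken to be the ratio σ/(F_ii)^{-1/2}, which is the ratio that the tabulated values follow.

```python
import math

def marginalized_errors(F):
    """1-sigma errors after marginalizing over the other parameter (2x2 case)."""
    (a, b), (_, d) = F
    det = a * d - b * b  # F is symmetric, so both off-diagonal entries equal b
    # For a symmetric 2x2 matrix, (F^-1)_00 = d/det and (F^-1)_11 = a/det.
    return math.sqrt(d / det), math.sqrt(a / det)

def unmarginalized_errors(F):
    """Single-parameter errors (F_ii)^(-1/2), with the other parameter held fixed."""
    return 1.0 / math.sqrt(F[0][0]), 1.0 / math.sqrt(F[1][1])

def add_gaussian_prior(F, index, sigma_prior):
    """Return a copy of F with a Gaussian prior of width sigma_prior on one parameter."""
    G = [list(row) for row in F]
    G[index][index] += 1.0 / sigma_prior ** 2
    return G

# Hypothetical, strongly correlated Fisher matrix for two parameters (p0, p1).
F = [[4.0e4, 1.98e5],
     [1.98e5, 1.0e6]]

sigma = marginalized_errors(F)
sigma_fixed = unmarginalized_errors(F)
D = [s / s0 for s, s0 in zip(sigma, sigma_fixed)]

# A prior on p1 alone also tightens p0, because the two parameters are degenerate.
sigma_with_prior = marginalized_errors(add_gaussian_prior(F, 1, 0.0015))
improvement = sigma[0] / sigma_with_prior[0]
print(D, improvement)
```

In the two-parameter case both parameters share the same D, which depends only on their correlation; with more parameters, D differs from one parameter to the next, as in the tables.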

Analogous to the discussion above, previous works have
clarified the cosmology-dependence of the cluster number
counts. We refer the reader to, e.g., Haiman et al. (2001) for
a detailed discussion; here we just emphasize two points. First,
the number counts also include a cosmology-dependence from the
Y-M relation, through the selection function g(Y_a, M). The
number-count constraint on Ω_b is driven through this dependence,
but the constraints on other parameters are dominated by either
the cosmological volume element or the growth function (except
the Ω_m dependence, which is dominated by the explicit linear
scaling of the cluster mass function with Ω_m). Second, we not
only have multiple redshift bins, but also multiple Y-bins.
This helps significantly in constraining cosmological parameters
(analogously to the shape of the cluster mass function being
helpful; e.g., Hu 2003). In Table 4 below, we present constraints
both with and without binning in Y. Comparing these two cases, we
see that the D parameter changes significantly, while
(F_ii)^{-1/2} changes only mildly. This shows that the tightening
of the constraints when 8 Y bins are used is achieved mainly by
breaking degeneracies between different parameters. In the same
table, we also list constraints when the scatter between Y and M
is set to zero. The constraints become more stringent, which
highlights the degrading effect of the Y-M scatter through
flattening the mass function. A small scatter in the
mass-observable relation (10% as assumed in this work) is seen to
degrade constraints by a factor of up to 4 (for the parameter h).
Again, we see that this is mainly due to higher degeneracies
in the presence of the scatter.

3.2 Constraints with cosmological and cluster 
parameters 

Below, we consider the more "realistic" case, in which we
take into account uncertainties in cluster structure and its
evolution. In § 2.2 we have parameterized cluster structure
and evolution with 15 parameters, characterizing various aspects
of ICM physics, namely the shape of the gravitational
potential, the gas entropy, non-thermal pressure, and boundary
condition, as well as the mass- and redshift-dependence
of these parameterized quantities. We repeat the above analysis,
but now also include these cluster structure parameters,
which means we constrain 23 parameters simultaneously.
The results of this exercise are shown in Table 5. Within
the parentheses next to the errors on the cosmological parameters,
we list the ratio of errors, R = σ_2/σ_1, where σ_1 is
the idealized constraint shown in Table 1 and σ_2 is the new
value in Table 5. This ratio R therefore quantifies the degradation
of the constraint introduced by the cluster parameter
uncertainties. For the majority of parameters, the degradation
is less than a factor of 2, and the constraints remain
tight, despite the large increase in parameter space (this is
true for both the scaling relation and number counts approaches).
We emphasize again that we simplified things by
assuming a particular form (eq. 11) for the mass and redshift
dependence of cluster parameters; nevertheless, these
results highlight the ability of upcoming surveys to constrain
a large number of parameters, which is due essentially to the
large number of clusters and therefore small statistical errors.

For the number counts approach, the largest degradation
is on Ω_b. This is understandable, because unlike for
other cosmological parameters, the constraint on Ω_b, as
we have mentioned, comes largely from the Y-M relation
in eq. (21), not from the mass function itself, and all of
the 15 cluster parameters affect the Y-M relation. This
could account for the large degeneracy between Ω_b and the
cluster parameters. Of course, Ω_b is measured accurately by
other methods (Kirkman et al. 2003; Dunkley et al. 2008),
and this degradation is not a concern. In the scaling relations,
however, the largest degradation is suffered by w_0.
This is because the simple power-law parameterization in
eq. 11 happens to be close to the way w_0 affects the evolution
of the scaling relation. This large degeneracy between
the DE equation-of-state parameters and cluster parameters
suggests that the cluster constraints on w_0 and w_a will be
especially useful when combined with independent measurements
of these parameters using other probes.

Table 5 shows further that the constraints on cluster parameters
are, in general, quite weak from both approaches,
with most constraints at the order-unity level. This,
conversely, indicates that the Y parameter is relatively
insensitive to cluster parameter variations. We see from Table 5
that the single-parameter errors (F_ii)^{-1/2} for the cluster
parameters are very low, showing that the weakness of these
constraints is due to strong degeneracies among the cluster
model parameters. One likely reason for this strong degeneracy
is that we adopted the same power-law form for
the mass- and redshift-evolution for each of the cluster
parameters. Indeed, we find that introducing even unrealistically
tight priors on the cosmological parameters does not
improve cluster parameter constraints significantly, indicating
that the degeneracies are among the cluster parameters
themselves. We thus conclude that the Y-T relation by
itself is not a good way of placing precise constraints on
individual cluster parameters, unless the mass-dependence
and redshift-evolution of the physical parameters can be
understood a priori, and they differ significantly from the
power-law forms assumed here. Of course, the Y-T relation
still delivers tight constraints on cluster-parameter
combinations, so it should be useful when combined with
other cluster observables.

Table 4. Constraints from number counts approach with different scatter and different number of Y bins. Columns labeled with
"Fiducial" contain results with 10% scatter and 8 Y bins; columns labeled with "1 Y bin" contain results with 10% scatter and 1 Y bin;
columns labeled with "No Scatter" contain results with no scatter and 8 Y bins.

             Fiducial                         1 Y bin                          No Scatter
Parameter    σ       (F_ii)^{-1/2}   D        σ       (F_ii)^{-1/2}   D        σ        (F_ii)^{-1/2}   D
Ω_m          0.023   0.0024          9.6      0.10    0.0026          39.4     0.018    0.0022          8.1
Ω_DE         0.29    0.016           18.4     1.0     0.016           63.0     0.19     0.015           12.5
Ω_b          0.007   0.00035         20.2     0.042   0.00043         99.2     0.0019   0.00030         6.4
h            0.11    0.006           18.6     0.54    0.007           75.9     0.028    0.0049          5.6
w_0          0.20    0.018           11.1     0.42    0.019           22.5     0.12     0.017           7.0
w_a          1.4     0.10            14.4     8.1     0.10            80.3     1.0      0.10            9.9
σ_8          0.016   0.0011          14.2     0.07    0.0011          66.5     0.011    0.0011          10.3
n_s          0.13    0.023           5.7      1.1     0.037           28.6     0.06     0.022           2.9

3.3 Effects of scatter and completeness 
uncertainties 

In addition to the uncertainties in cluster structure, there are
also uncertainties in scatter and completeness. The scaling
relation test is affected by scatter and incompleteness only
indirectly, through Malmquist bias. This bias is the increase
in the mean value of Y at fixed T that arises because the
lowest-Y clusters, scattered below the detection threshold, are
missing from the sample. We find that the scatter changes
the number of detectable clusters in each redshift bin by
less than 1% (except in the three bins beyond z = 1.8,
where the changes are between 1-2%), and the mean ⟨Y⟩
is changed by a similar amount. This is comparable to the
change in ⟨Y⟩ caused by variations in our parameters within
their marginalized 1σ uncertainties (see Tables 2, 3 and 1).
However, since flux-limited surveys can apply a correction that
should eliminate the bulk of the Malmquist bias, we believe
this will not be a major limitation of the constraints.
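
The size of such a Malmquist bias can be illustrated with a toy calculation. The sketch below assumes, purely for illustration, a Gaussian scatter in Y at fixed T and a sharp detection threshold; the mean, scatter and threshold values are hypothetical and are not the survey parameters used in this work.

```python
import math

def truncated_gaussian_mean(mu, sigma, y_lim):
    """Mean of a Gaussian N(mu, sigma) restricted to Y > y_lim,
    together with the fraction of objects above the threshold."""
    alpha = (y_lim - mu) / sigma
    pdf = math.exp(-0.5 * alpha ** 2) / math.sqrt(2.0 * math.pi)
    detected = 0.5 * math.erfc(alpha / math.sqrt(2.0))  # 1 - Phi(alpha)
    return mu + sigma * pdf / detected, detected

# Hypothetical numbers: true mean 20 mJy, 10% scatter, 15 mJy threshold.
mu, frac_scatter, y_lim = 20.0, 0.10, 15.0
mean_obs, detected = truncated_gaussian_mean(mu, frac_scatter * mu, y_lim)
bias = mean_obs / mu - 1.0
print(f"detected fraction = {detected:.4f}, bias in <Y> = {100 * bias:.2f}%")
```

When the threshold lies a few scatter widths below the mean, both the fraction of lost clusters and the shift in ⟨Y⟩ stay below the percent level in this toy model.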

The effect of scatter and completeness uncertainties on
the number counts constraints is somewhat more subtle. In
this section, we allow both the scatter and the completeness to
vary, together with the cosmological and cluster parameters.
The scatter between M and Y is parameterized using the
same power-law form as the cluster parameters (eq. 11).
The completeness C, defined as the fraction of clusters at a
given Y and redshift z that are detected, is taken to be given
by a similar power-law, except that M/M* is replaced by Y/Y*
in that equation, where Y* is chosen to be 1 mJy.

The fiducial scatter is assumed to be 10%, and the fiducial
completeness is set to be 100% (both independent of
redshift and mass, except that completeness is set to zero below
M_min). Table 6 shows the results when scatter is included
and when both scatter and incompleteness are included. For
comparison, we also list the result when neither uncertainty
is taken into account (repeating the NC column from Table 5).
The degradations are relatively small for most parameters.
The two exceptions are Ω_m and σ_8, for which the
number-count constraints degrade by a factor of ~3 when
the completeness uncertainty is included. The impact of the
completeness uncertainty is relatively modest, because our
treatment assumes that we know the form of the dependence
of C on Y and z (i.e. power-laws in our case). At the opposite
extreme, if one allows completeness to be an arbitrary
function of mass and redshift, then of course no constraint
can be derived on any model parameter. The fact that we
still find interesting constraints shows that a reliable
parameterization of the completeness as a function of Y and
z will be very important. We also find that all of the constraints
would recover their values (to within 30%) in the
fixed C = 1 case when a prior of 15% is applied to C. Finally,
in Table 6 we also list the combined constraints from the
scaling relations and the number counts. Comparing these
values with the combined constraints listed in Table 5, we
find that these constraints are less affected by incompleteness
than those from the number counts approach alone. Except
for σ_8, which degrades by a factor of ~2, the constraints all
degrade by factors of ≲1.5.

The physical origin of scatter in the observables Y and
T can be cluster-to-cluster variations in the structure parameters
(even in the context of our idealized spherical models),
and geometrical effects (i.e. viewing aspherical clusters along
different sight-lines). As mentioned in § 2.2 above, in addition
to the Malmquist bias, underlying scatter in a physical
structure parameter can cause a bias by producing a skewed
probability distribution for Y. In order to assess how large
this additional bias might be, we have performed the following
calculation. First, looking at the third column in Table 5,
we see that the best constrained individual cluster parameter
is s_norm. This suggests that among the cluster structure
parameters, it is this parameter (the slope of the entropy
profile) that could cause the largest bias. We then assumed
a symmetric, Gaussian scatter on this parameter, with an
r.m.s. equal to 10% of its fiducial value (σ(s_norm) ≈ 0.11). We
then computed the distribution of Y-values, for clusters at
z = 0.5 and T = 7 keV, induced by this Gaussian scatter (see
the footnote below). In the absence of any scatter in s_norm,
the mean SZ decrement at T = 7 keV is ⟨Y⟩ = 70.46 mJy. When
the scatter is included, we find that ⟨Y⟩ is increased by 2%,
to 71.88 mJy.



^ Note that changes in s_norm change both Y and T; since we
are interested in the distribution of Y at fixed T, we adjust the
cluster mass M, for each value of s_norm, to keep T fixed at 7 keV.

Table 5. Constraints as in Table 1, except in the more realistic case that includes cluster structure parameters. Within the parentheses
next to the errors on each cosmological parameter, we list the factor by which the constraints degrade relative to the idealized case.

              SR                           NC                           Combined
Parameter     σ            (F_ii)^{-1/2}   σ            (F_ii)^{-1/2}   σ             (F_ii)^{-1/2}   C
Ω_m           0.087(1.6)   0.00030         0.068(2.9)   0.0024          0.038(4.2)    0.00030         0.50
Ω_DE          0.28(1.4)    0.029           0.34(1.2)    0.016           0.15(2.5)     0.014           0.46
Ω_b           0.080(6.7)   0.000062        0.10(14.3)   0.00035         0.038(10.3)   0.000061        0.38
h             0.12(1.5)    0.0014          0.14(1.3)    0.0060          0.075(1.5)    0.0014          0.70
w_0           0.53(14.3)   0.010           0.34(1.7)    0.018           0.22(13.8)    0.0089          0.58
w_a           0.64(3.0)    0.044           1.63(1.2)    0.099           0.45(4.1)     0.040           0.58
σ_8           N/A          N/A             0.055(3.4)   0.0011          0.033(4.7)    0.0011          0.36
n_s           N/A          N/A             0.87(6.5)    0.023           0.46(12.8)    0.023           0.28
K_norm        4.37         0.0042          2.25         0.020           0.56          0.0041          0.08
K_m           0.67         0.0014          0.96         0.010           0.24          0.0014          0.19
K_z           1.95         0.0048          2.03         0.022           0.79          0.0047          0.32
s_norm        0.98         0.00092         2.40         0.0097          0.57          0.00092         0.40
s_m           0.075        0.00061         1.75         0.010           0.042         0.00061         0.32
s_z           0.96         0.0024          3.56         0.023           0.48          0.0024          0.27
b_norm        3.54         0.0063          2.60         0.014           1.02          0.0057          0.24
b_m           1.92         0.0050          1.83         0.015           0.73          0.0048          0.30
b_z           0.011        0.0026          0.26         0.0093          0.0083        0.0025          0.61
c^NFW_norm    2.02         0.0030          6.67         0.076           1.03          0.0030          0.29
c^NFW_m       1.05         0.0024          6.59         0.10            0.49          0.0024          0.22
c^NFW_z       2.47         0.0085          5.74         0.25            1.33          0.0085          0.35
η_norm        0.85         0.0016          0.45         0.0050          0.18          0.0015          0.20
η_m           0.97         0.0015          0.85         0.0062          0.24          0.0015          0.14
η_z           1.09         0.0049          1.33         0.013           0.47          0.0046          0.31



This level of bias is certainly a concern. From the parameter
constraints σ(p) listed in Table 1 and the logarithmic
dependencies of Y on p listed in Tables 2 and 3, we
infer that systematic errors on the measurement of ⟨Y⟩ at
fixed T should be controlled to within ~1%, in order for
the corresponding bias on the parameters not to exceed the
1σ constraints. Furthermore, in the above exercise, we find
a standard deviation of σ(Y) = 14 mJy = 0.194⟨Y⟩; i.e. a
value that is almost double the fiducial adopted 10% scatter
in the Y-T relation. These numbers suggest that, in order
to realize constraints comparable to the forecasts we present
here, it will be necessary to obtain a physical understanding
of the scatter in cluster structure from hydrodynamical
simulations. When the technique proposed here is applied
to an actual large data-set, it will be important to include
parameters that describe such scatter. The full distribution
of Y at fixed T, rather than the single value ⟨Y⟩, can then be
used as additional signal, in order to constrain these extra
parameters.
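
To see how a symmetric scatter in a structure parameter can shift the mean of Y, consider a rough consistency check. It assumes, purely for illustration, that Y depends exponentially on the scattered parameter, so that Y is lognormally distributed; this functional form is an assumption of the sketch and not the cluster model of this work. Given the quoted fractional standard deviation of Y, the lognormal form then predicts how far the mean lies above the zero-scatter value.

```python
import math

frac_std = 0.194  # sigma(Y) / <Y> quoted in the text

# For a lognormal Y: (std/mean)^2 = exp(s^2) - 1, where s is the std of ln Y.
s2 = math.log(1.0 + frac_std ** 2)

# The zero-scatter value of Y corresponds to the median of the lognormal,
# and the mean exceeds the median by a factor exp(s^2 / 2).
mean_shift = math.exp(0.5 * s2) - 1.0
print(f"mean shift = {100 * mean_shift:.1f}%")
```

Under this assumption the shift is a little under 2%, of the same order as the 2% increase in ⟨Y⟩ found in the full calculation.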



4 DISCUSSION 

4.1 Comparing the Y — T and dN/dz Constraints 

We have forecast the constraints from the Y-T scaling relation
approach and the number counts approach. Interestingly,
we find that these two approaches yield comparable results,
even though the cluster abundance is more sensitive to
cosmological parameters than the scaling relation. There are
several reasons, however, for the scaling relations to be
competitive, at least statistically. First, the number counts only
utilize Y and z of a cluster, while the scaling relation also
derives information from the temperature T. Second, in the
scaling relations, each cluster adds a new data point, so
that we are effectively using N ≈ 6,800 independent
measurements of the observable Y. Each Y-measurement has
a fractional error significantly below order unity (of order
ΔY/Y = σ_{Y,T} ≈ 20% in eq. 18), so that when these are
combined independently, the effective combined statistical
error is σ_{Y,T}/√N. This is better than the total effective
Poisson error from the cluster counts (~ √N in the counts,
i.e. a fractional error of ~ 1/√N). This statement holds as long
as Y is measured at fixed T with an
uncertainty better than order unity, since effectively, in the
number count approach, each cluster contributes an
order-unity statistical error. This comparison neglects systematic
errors, which can ultimately limit the constraints from both
approaches. In particular, systematic errors in both the
measurements of Y(T) and its model predictions have to be at
the ≲ 1/√N ≈ 1% level, in order not to dominate over the
statistical errors; likewise, selection effects for the cluster
counts have to be controlled to this level of systematic
accuracy. Third, unlike cluster counts, the scaling relation does
not explicitly depend on σ_8 and n_s. The only dependence
is through the Malmquist bias, but we find this dependence
to be negligibly small, even when varying σ_8 in the range of
0.7-0.9. Therefore, the scaling relation avoids degeneracies
involving these parameters. Finally, as mentioned above, the
number counts approach is less robust to selection errors.
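
The statistical argument above amounts to a short piece of arithmetic, sketched here with the cluster number and the per-cluster fractional error quoted in the text. The variable names are ours, and treating every cluster as an independent measurement with identical error is a simplifying assumption.

```python
import math

n_clusters = 6800   # approximate number of clusters quoted in the text
sigma_yt = 0.20     # fractional error on Y at fixed T per cluster

# Scaling relation: N independent measurements, each with fractional error sigma_yt.
sr_error = sigma_yt / math.sqrt(n_clusters)

# Number counts: Poisson error sqrt(N) on N counts, i.e. fractional error 1/sqrt(N).
count_error = 1.0 / math.sqrt(n_clusters)

print(f"scaling relation: {100 * sr_error:.2f}%")
print(f"number counts:    {100 * count_error:.2f}%")
```

The ratio of the two errors is simply 1/σ_{Y,T}, so the scaling relation keeps its statistical advantage as long as the per-cluster fractional error stays below order unity.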

Quantitatively, the scaling relation approach yields
tighter constraints on Ω_DE, h, w_0 and w_a when the
uncertainty in cluster structure is ignored, while the number
counts do better for the other parameters. Once we include
cluster structure uncertainty, the constraint on w_0 from the
scaling relations and that on Ω_b from the number counts suffer
bad degradations.

Table 6. Constraints from number counts approach with scatter
and incompleteness taken into account. First column is the result
when the uncertainties on scatter and completeness are ignored;
second column is the result when the uncertainty on scatter is
included; third column is the result when both uncertainties are
taken into account; lastly we also list the combined constraints
from number counts approach and scaling relation approach with
scatter and incompleteness included.

              None     +S†      +S+I†    +S+I†
Parameter     NC       NC       NC       Combined   C
Ω_m           0.068    0.072    0.225    0.060      0.55
Ω_DE          0.34     0.35     0.66     0.22       0.72
Ω_b           0.10     0.12     0.12     0.053      0.65
h             0.14     0.17     0.18     0.081      0.68
w_0           0.35     0.46     0.56     0.27       0.49
w_a           1.63     1.66     1.78     0.48       0.66
σ_8           0.055    0.061    0.17     0.072      0.19
n_s           0.87     0.92     1.08     0.64       0.35
K_norm        2.25     2.73     2.78     1.41       0.36
K_m           0.96     1.24     1.48     0.26       0.18
K_z           2.03     2.08     2.12     0.91       0.40
s_norm        2.40     2.90     2.93     0.68       0.54
s_m           1.75     1.88     2.07     0.045      0.36
s_z           3.56     3.77     3.99     0.52       0.31
b_norm        2.60     3.11     3.16     1.16       0.24
b_m           1.83     2.17     2.33     0.86       0.34
b_z           0.26     0.44     0.45     0.0089     0.70
c^NFW_norm    6.67     7.25     7.66     1.45       0.55
c^NFW_m       6.59     6.99     7.42     0.57       0.30
c^NFW_z       5.74     6.08     7.18     1.38       0.35
η_norm        0.45     0.64     0.64     0.26       0.26
η_m           0.85     0.93     0.96     0.26       0.14
η_z           1.39     1.38     1.44     0.48       0.30
σ_norm        ...      0.18     0.18     0.088      0.23
σ_m           ...      7.60     7.79     3.77       0.23
σ_z           ...      8.64     8.77     5.75       0.43
C_norm        ...      ...      1.72     0.56       0.11
C_m           ...      ...      0.13     0.088      0.46
C_z           ...      ...      2.68     1.14       0.18

† "S" stands for "Scatter", "I" stands for "Incompleteness".

For the complementarity parameter C of eq. (31): in the idealized
case (Table 1), this parameter is below 0.2
for 5 of the 8 cosmological parameters, meaning that for
these parameters, the combined constraints are more than
a factor of two tighter than simply adding the two results
in quadrature. This parameter is larger in the realistic case
(Table 5), with an average of ~0.48 for the cosmological
parameters (the lowest value is 0.28, for n_s).



4.3 Beyond Power-Law Cluster Structure Models 

In Figure 1 we saw a hint that our power-law parameterizations
(in eq. 11) might be insufficient over an extended
mass (and possibly also redshift) range. Since the degeneracy
between cosmological and cluster parameters depends
on this explicit form, we investigated the impact of allowing
"curvature" in these power-laws. Specifically, we modified
equation (11) to include higher order terms,

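$$ \ln p = \ln p_{\rm norm} + p_{m1} \ln\left(\frac{M}{M_*}\right) + p_{m2} \left[\ln\left(\frac{M}{M_*}\right)\right]^2 + p_{z1} \ln(1+z) + p_{z2} \left[\ln(1+z)\right]^2 . \qquad (32) $$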

With the inclusion of these new terms, we have 5 × 5 = 25
cluster model parameters. We use the Fisher matrix technique
to forecast constraints on these 25 parameters, together
with the cosmological parameters. Compared to the
15 cluster parameter case (Table 5), we find that the constraints
on Ω_m, w_0 and w_a are the most affected, with an
increase in their marginalized errors by a factor of ~2. From
the large degradation factors (~10) shown in the parentheses
in Table 5, we see that Ω_b is sensitive to cluster structure
uncertainties, as well. However, we find that the constraints
on Ω_DE and h are relatively robust to these uncertainties.
The reason is that, as shown above, the constraints on these
two parameters arise through D_A, which is not affected by
cluster parameters, while the constraints on all other parameters
receive a significant contribution from the cluster
properties. We also find that if we simultaneously impose a
prior of 50% on the fractional error on each of the 25 cluster
parameters, this recovers constraints similar to those of the 15
cluster parameter case. This suggests that relatively weak
priors may mitigate the impact of more complicated cluster
structure models.
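
The extended parameterization of equation (32) can be written as a small function. This is a minimal sketch: the function and argument names are ours, the pivot mass M_* and the numerical values in the example are hypothetical, and the mass term is assumed to enter through ln(M/M_*), as in the original power-law form.

```python
import math

def cluster_parameter(p_norm, p_m1, p_m2, p_z1, p_z2, mass, m_star, z):
    """Evaluate a cluster structure parameter p(M, z) whose logarithm is
    quadratic in ln(M/M_*) and in ln(1+z)."""
    ln_m = math.log(mass / m_star)
    ln_z = math.log(1.0 + z)
    ln_p = (math.log(p_norm)
            + p_m1 * ln_m + p_m2 * ln_m ** 2
            + p_z1 * ln_z + p_z2 * ln_z ** 2)
    return math.exp(ln_p)

# Hypothetical example: a parameter with fiducial normalization 1.1,
# evaluated for a cluster twice the pivot mass at z = 0.5.
pure_power_law = cluster_parameter(1.1, 0.2, 0.0, -0.1, 0.0, 2.0e14, 1.0e14, 0.5)
with_curvature = cluster_parameter(1.1, 0.2, 0.05, -0.1, 0.05, 2.0e14, 1.0e14, 0.5)
print(pure_power_law, with_curvature)
```

Setting both second-order coefficients to zero recovers the pure power law, so each of the five physical quantities carries five parameters instead of three.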



4.2 Combined Y -T and dN/dz Constraints 

We also computed the combined constraints from the two
methods. As stated above, we assume that the two constraints
are independent, and simply add the Fisher matrices;
we justify this assumption in § 4.11 below. Combining
the scaling relation with the number counts is useful to break
degeneracies. For example, in the idealized case, the combined
constraint on n_s is a factor of 3 tighter than that from
number counts alone. This improvement is entirely from
degeneracy breaking, since the scaling relation approach does
not directly constrain n_s. While the constraints in the
realistic case degrade to uninteresting levels, the degeneracy
breaking is also helpful in constraining other parameters. To
quantify this, we computed the "complementarity" parameter,
which is defined as (Fang & Haiman 2008),



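$$ C = \left(\Delta p^{\rm combined}\right)^2 \sum_i \frac{1}{\left(\Delta p^{i}\right)^2} , \qquad (31) $$

where the sum runs over the individual methods i (here SR and NC).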
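
The complementarity parameter can be checked against the tabulated errors. The sketch below assumes that C is the combined variance multiplied by the sum of the inverse variances of the individual methods; this reading reproduces the tabulated C values, and the function name is ours. The input errors are the first cosmological row and the σ_8 row of Table 5.

```python
def complementarity(sigma_combined, sigma_individual):
    """Combined variance times the sum of inverse individual variances.
    C = 1 when the combination is a plain inverse-variance (quadrature) sum;
    C < 1 signals extra gain from broken degeneracies."""
    return sigma_combined ** 2 * sum(1.0 / s ** 2 for s in sigma_individual)

# First cosmological row of Table 5: SR = 0.087, NC = 0.068, combined = 0.038.
c_omega_m = complementarity(0.038, [0.087, 0.068])

# sigma_8 row: only the number counts constrain it (NC = 0.055, combined = 0.033).
c_sigma8 = complementarity(0.033, [0.055])

print(round(c_omega_m, 2), round(c_sigma8, 2))
```

Both values round to the entries listed in the last column of Table 5 (0.50 and 0.36).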



4.4 Uncertainties from the Low-Redshift
Mass-Floor

At low redshifts, the mass corresponding to a simple constant
SZ flux is very low, dropping below masses corresponding
to galaxy clusters. The nearby objects extend over a
large solid angle, so that surface brightness selection effects
are no longer negligible (see the footnote below). While this
issue can be addressed by using different cluster-finding
algorithms (e.g., Sehgal et al. 2007), for simplicity, we have
imposed a constant mass limit of 10^14 h^-1 M_⊙. To see whether
the constraints are sensitive to the (somewhat arbitrary) choice
of this mass floor, we re-computed our constraints for a different
value of 5 × 10^13 h^-1 M_⊙. Lowering the mass floor increases
the total number of clusters by ~13% (i.e. by ~900 new clusters at
redshift below ~0.2, where the mass floor is above the mass
limit set by the flux threshold). The scaling relation results
are not very sensitive to this change; the difference in the
constraints is only a few percent in the idealized case, and
up to 20% in the realistic case. The results, however, change
more significantly in the number counts approach. In the
idealized case, Ω_m, Ω_DE, w_0 and w_a are most affected: their
constraints improve to 0.014, 0.080, 0.094 and 0.28 respectively
(compared to the original results of 0.023, 0.29, 0.20
and 1.4). These large improvements cannot be explained by
the modest, ~13% increase in the number of clusters, implying
that the low-redshift, low-mass clusters help break
the significant degeneracies among these parameters. In the
realistic case, lowering the mass floor by a factor of two still
improves the constraints on (Ω_m, Ω_DE, w_0, w_a) by factors
of (1.4, 1.9, 1.6, 4.6). This underscores the importance of an
accurate measurement of the abundance of low-mass clusters
at low redshifts. It also shows that mis-estimates in the
value of the mass floor can potentially introduce a large bias;
such mis-estimates are equivalent to errors in the selection
function as discussed in § 3.3 above.

^ The selection will also be affected by instrumental specifications,
such as the beam profile, which will have to be taken into
account in an actual analysis.



4.5 Uncertainties from Excising the Cluster Core 

A potential concern is that in our Y-T scaling relation, we
restricted the definition of the emission-weighted temperature
to radii outside 0.15 R_500 (see eq. 17). While this
eliminates the sensitivity of the temperature to the complex
physics in the cluster core, it can still introduce uncertainties
or a bias in the inferred T_ew, since the inner radius has
to be estimated from the data itself. We can ask the simple
question: how accurately do we have to know R_500, in order
for the corresponding bias not to exceed the 1σ constraints
we obtained above? As mentioned in § 3.3 above, the systematic
errors on the measurement of Y at fixed T_ew should
be controlled to within ~1%. Since Y ∝ T_ew^{5/2} (see eq. 30),
this implies that the systematics of the T_ew measurement
have to be accurate to ~0.4%. We emphasize that this is
the required systematic error, and a much larger scatter on
the measurements for individual clusters is tolerable (in our
case, the individual temperatures need to be known to an
accuracy of √6800 × 0.4% ≈ 30%). We find that a 0.4%
change in the temperature corresponds to a 5% change in
the radius (i.e. in the lower limit of the integral in eq. 17;
for reference, the difference between R_600 and R_500 is around
8%). The cosmological dependence of R_500 is smaller than
this: a 1σ (= 0.055) change in Ω_m induces a 3.5% change in
R_500, and other cosmological dependencies are much weaker.
We conclude that the cosmology-dependence of R_500 will
not degrade the errors by more than 1σ. However, the 5%
error in the radius translates into a 15% systematic error
requirement on the enclosed mass. It should be possible to
calibrate the Y-M relation to within 15% systematic accuracy
(Kravtsov et al. 2006), especially if ~10% priors on
Ω_m and Ω_b are used.
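
The chain of requirements in this subsection can be traced with simple arithmetic. The sketch rests on two assumptions that are ours: Y scales as the 5/2 power of temperature at fixed values of the other quantities (the exponent implied by the quoted step from ~1% to ~0.4%), and N ≈ 6,800 clusters contribute independent temperature measurements. The enclosed mass is taken to scale as the cube of the radius at fixed overdensity.

```python
import math

y_systematic = 0.01    # required systematic accuracy on Y at fixed T_ew
y_t_exponent = 2.5     # assumed logarithmic slope d ln Y / d ln T
n_clusters = 6800      # assumed number of independent clusters

# Systematic accuracy required on the temperature.
t_systematic = y_systematic / y_t_exponent

# Tolerable random error per cluster, if it averages down as 1/sqrt(N).
t_per_cluster = math.sqrt(n_clusters) * t_systematic

# Radius tolerance quoted in the text, and the mass tolerance implied by M ∝ R^3
# (linearized, d ln M = 3 d ln R).
r_tolerance = 0.05
m_tolerance = 3.0 * r_tolerance

print(f"T systematic:  {100 * t_systematic:.1f}%")
print(f"T per cluster: {100 * t_per_cluster:.0f}%")
print(f"M systematic:  {100 * m_tolerance:.0f}%")
```

The per-cluster tolerance comes out near 30%, far looser than the systematic requirement, which is why a sizeable scatter in individual temperatures is acceptable.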



4.6 Using Priors on Cluster Structure to Improve 
Constraints 

In the realistic case above, we have arguably been pessimistic
by allowing all of the cluster parameters to vary arbitrarily.
In reality, useful priors may be available on these parameters,
either from other observations, or from simulations.
For example, detailed X-ray measurements already reveal
ICM entropy profiles in low redshift clusters (Ponman et
al. 2003; Pratt et al. 2006). Likewise, the NFW concentration
parameter has been carefully studied in numerical
simulations (e.g., Navarro et al. 1997; Wechsler et al. 2002;
Kuhlen et al. 2005). To assess whether such priors could
improve constraints on cosmological parameters, in Table 7
we present the results of calculations that adopt a prior of
1.0 and 0.1 on the fractional errors on all cluster parameters.
Arguably, order-unity priors should be achievable, and
in this sense, the results in Table 7 are even more "realistic"
than those without priors. While improvements are noticeable
(approaching factors of two for w_0 and w_a) for the
SR-only constraints even with these weak priors, Table 7
shows that improving the combined SR+NC constraints by
a factor of two requires ~10% priors.

In order to assess whether a single cluster parameter
is the "culprit", we next tried applying priors on individual
cluster parameters. We found no significant improvements,
even if we applied priors of 0.001 on any one of the cluster
parameters. This shows that the degeneracies are all inherently
multi-dimensional. As another series of exercises,
we applied simultaneous priors either on the set of all
normalization parameters (K_norm, s_norm, b_norm, c_norm, η_norm),
the set of all mass-dependence parameters, or the set of
all redshift-dependence parameters. We focus on how these
priors affect SR constraints on w_0, since this is the constraint
most affected by the cluster parameters. We find that
the largest improvements are afforded by priors on
redshift-dependence parameters. The constraint improves from 0.53
to 0.25 when a prior of 0.1 is applied to all redshift-dependence
parameters.



4.7 Beyond the Total SZ Decrement 

Although large upcoming SZ surveys will not produce re- 
solved images of a large fraction of the detected clusters, it 
should still be possible to go beyond measuring the overall 
SZ decrement, and to obtain at least a rough constraint on 
its profile. For example, for SPT and ACT, the expected an- 
gular resolution is 1', while clusters typically extend a few 
arc minutes. Given that our cluster model specifies the ra- 
dial structure of the cluster, it is interesting to ask whether 
this additional information might help further constrain pa- 
rameters. To see how large these improvements could be, 
we divided every cluster into two annular regions: an in- 
ner part with radius 2', and an outer part from r = 2' to 
the virial radius. We assumed the SZ survey could indepen- 
dently measure Y in both regions. The measurement errors 
are assumed to be proportional to the square root of their 
solid angles, and the sum in quadrature of the two errors is 
fixed to be 1 mjy (to be consistent with our preceding cal- 
culations, which adopted 1 mJy for the cluster as a whole). 
The Fisher matrix forecasts that use both Y observables are given in
Table 8, with and without cluster parameters included. The columns
$\mathcal{R}$ list the ratio of the new constraints to their corresponding
old values from Tables 1 and 5. As these ratios reveal, even this very
crude measurement of the Y profile can improve the constraints by a factor
of ~2.

Table 7. Constraints from the scaling relations, using pessimistic or
optimistic priors on cluster model parameters. Priors are applied
simultaneously to all of the cluster parameters, but only the constraints
on cosmological parameters are listed.

                            Prior of 1                          Prior of 0.1
Parameter        SR      NC      SR+NC               SR      NC      SR+NC
$\Omega_m$       0.071   0.053   0.035    0.70       0.064   0.029   0.022    0.66
$\Omega_{DE}$    0.26    0.31    0.14     0.52       0.24    0.30    0.10     0.30
$\Omega_b$       0.047   0.047   0.030    0.80       0.016   0.012   0.0077   0.63
$h$              0.11    0.12    0.072    0.78       0.089   0.11    0.066    0.90
$w_0$            0.34    0.28    0.19     0.77       0.17    0.21    0.12     0.85
$w_a$            0.46    1.56    0.37     0.69       0.34    1.45    0.21     0.41
$\sigma_8$       N/A     0.044   0.032    0.52       N/A     0.023   0.017    0.52
$n_s$            N/A     0.61    0.40     0.43       N/A     0.27    0.22     0.71

Table 8. Constraints on cosmological parameters when the solid area of the
cluster is divided into two annular regions, an inner part with angular
radius 2' and an outer part outside radius 2'. The constraints assume the
Y parameters in both regions are measurable independently. The columns
$\mathcal{R}$ list the ratio of these constraints to their corresponding
values from Tables 1 and 5, which use a single Y parameter.

                   6 parameters                  21 parameters
Parameter        $\sigma$   $\mathcal{R}$      $\sigma$   $\mathcal{R}$
$\Omega_m$       0.010      0.19               0.050      0.58
$\Omega_{DE}$    0.042      0.21               0.18       0.65
$\Omega_b$       0.0040     0.34               0.036      0.45
$h$              0.048      0.63               0.075      0.65
$w_0$            0.013      0.36               0.21       0.41
$w_a$            0.12       0.57               0.30       0.47
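
The error budget adopted in this section for the two Y measurements can be
written down explicitly. The sketch below assumes a flat-sky geometry, so
that a disc of angular radius $\theta$ subtends a solid angle
$\pi\theta^2$; the function name and the 4' virial radius of the example
cluster are hypothetical choices made for illustration.

```python
import math

def annulus_errors(theta_vir_arcmin, theta_split_arcmin=2.0, sigma_total_mjy=1.0):
    """Split a total flux error between an inner disc and an outer annulus.

    Each error is taken proportional to the square root of the region's solid
    angle, and the two errors add in quadrature to sigma_total_mjy.
    """
    if theta_vir_arcmin <= theta_split_arcmin:
        raise ValueError("cluster must extend beyond the split radius")
    omega_in = math.pi * theta_split_arcmin**2
    omega_out = math.pi * (theta_vir_arcmin**2 - theta_split_arcmin**2)
    omega_tot = omega_in + omega_out
    sigma_in = sigma_total_mjy * math.sqrt(omega_in / omega_tot)
    sigma_out = sigma_total_mjy * math.sqrt(omega_out / omega_tot)
    return sigma_in, sigma_out

# Hypothetical cluster with an angular virial radius of 4 arcmin.
sigma_in, sigma_out = annulus_errors(theta_vir_arcmin=4.0)
print(f"inner: {sigma_in:.3f} mJy, outer: {sigma_out:.3f} mJy")
```

For this example the inner region covers a quarter of the total solid
angle, so it carries half of the 1 mJy error and the outer annulus carries
the remaining $\sqrt{3}/2$ mJy.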

4.8 Spectroscopic Coverage 

A significant caveat, mentioned above, is that full spectroscopic X-ray
coverage of all SZ clusters is likely to be unavailable, since the X-ray
flux scales with redshift as $D_L^{-2}$, rather than as $D_A^{-2}$ like the
SZ flux. For simplicity, we here study the effect of partial X-ray data by
discarding all clusters beyond z = 1 from the scaling relations. The
results are presented in Table 9, where again the columns $\mathcal{R}$
list the ratio of these new constraints to their old values in Tables 1
and 5. It turns out that the degradation is small when cluster structure
uncertainty is neglected: the largest difference is in $\Delta w_0$, whose
error increases from 0.037 to 0.05. When the cluster structure parameters
are included, the degradation is more severe (a factor of 1.7 for $w_0$).
Given that the number of clusters beyond z = 1 is only about 5% of the
total, and that they can tighten the constraints by a factor of ~2, it
could be worthwhile to conduct deep X-ray follow-up measurements of these
~300 clusters at z > 1. We note here for reference that the eROSITA deep
survey is planned to cover 200 deg$^2$; acquiring ~30% temperatures out to
z = 1, and/or following up on ~300 clusters at z > 1, will be challenging,
but should be feasible in a large dedicated X-ray survey (e.g., Haiman et
al. 2005). An alternative method may be to rely only on X-ray imaging data:
Mroczkowski et al. (2008) find that a joint analysis of SZ and X-ray
imaging data, for 3 clusters detected with the SZA instrument, yields
temperature profiles that are in good agreement with spectroscopic X-ray
measurements. This may significantly ease the requirement on the depth of
the X-ray survey.

Table 9. Constraints from scaling relations up to z = 1 only.

                   6 parameters                  21 parameters
Parameter        $\sigma$   $\mathcal{R}$      $\sigma$   $\mathcal{R}$
$\Omega_m$       0.066      1.2                0.20       2.3
$\Omega_{DE}$    0.23       1.1                0.32       1.1
$\Omega_b$       0.014      1.2                0.21       2.7
$h$              0.081      1.1                0.13       1.1
$w_0$            0.050      1.4                0.92       1.7
$w_a$            0.23       1.1                1.11       1.8

4.9 Impact of Flat Universe Prior 

We also note that most other forecasts in the literature tend to adopt a
flat universe prior. When we impose this condition
($\Omega_m + \Omega_{DE} = 1$), we find that our conclusions do not change
significantly. In the idealized case without cluster parameters, we find
that the NC-alone constraint on $\Delta w_a$ is reduced from 1.4 to 0.47,
and the SR-alone constraint on $\Delta w_0$ is reduced from 0.037 to 0.012.
These improvements, however, are much less significant when cluster
structure uncertainty is taken into account (the NC-alone constraint on
$\Delta w_a$ is reduced from 1.61 to 1.0, and the SR-alone constraint on
$\Delta w_0$ is reduced from 0.53 to 0.51).
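
One common way to impose a flat-universe prior on an existing Fisher matrix
is to eliminate $\Omega_{DE}$ through $\Omega_{DE} = 1 - \Omega_m$ and to
project the matrix with the Jacobian of this substitution,
$F' = J^T F J$. The sketch below assumes this approach (the text does not
state how the condition was implemented). The function name, the parameter
ordering ($\Omega_m$, $\Omega_{DE}$, $w_0$) and the numerical values of `F`
are hypothetical.

```python
import numpy as np

def impose_flatness(fisher, i_m, i_de):
    """Project a Fisher matrix onto the surface Omega_m + Omega_DE = 1.

    Omega_DE is eliminated via Omega_DE = 1 - Omega_m, so the Jacobian has
    d(Omega_DE)/d(Omega_m) = -1. Returns the reduced matrix J^T F J, whose
    parameters are the original ones with Omega_DE removed.
    """
    n = fisher.shape[0]
    keep = [k for k in range(n) if k != i_de]
    jac = np.zeros((n, n - 1))
    for new, old in enumerate(keep):
        jac[old, new] = 1.0
    jac[i_de, keep.index(i_m)] = -1.0
    return jac.T @ fisher @ jac

def marginalized_errors(fisher):
    """1-sigma marginalized errors from the inverse Fisher matrix."""
    return np.sqrt(np.diag(np.linalg.inv(fisher)))

# Hypothetical Fisher matrix, ordered as (Omega_m, Omega_DE, w_0).
F = np.array([[50.0, 20.0, 10.0],
              [20.0, 30.0, 15.0],
              [10.0, 15.0, 25.0]])

F_flat = impose_flatness(F, i_m=0, i_de=1)     # parameters: (Omega_m, w_0)
err_open = marginalized_errors(F)
err_flat = marginalized_errors(F_flat)

print(f"w_0 error without flatness: {err_open[2]:.3f}")
print(f"w_0 error with flatness:    {err_flat[1]:.3f}")
```

Because the projection removes one direction of parameter space, the
marginalized errors on the remaining parameters can only stay the same or
shrink.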

4.10 Utilizing CMB Constraints 

We also computed the joint constraints on cosmological parameters from the
scaling relations, number counts, and the CMB temperature and polarization
anisotropy measurements by the upcoming Planck satellite. Table 10
presents these results. The Planck Fisher matrix is adopted from Heavens
et al. (2007). This matrix assumes a flat universe, so we used the SR and
NC matrices with the same assumption, and we placed a prior of 0.1 on all
cluster parameters.

Table 10. Constraints including CMB anisotropy measurements by Planck. The
constraints assume a flat universe, and a prior of 0.1 is applied to all
cluster parameters. Only constraints on cosmological parameters are listed.

Parameter        SR       NC       Planck     Combined
$\Omega_m$       0.059    0.029    0.0023     0.0020
$\Omega_b$       0.015    0.012    0.00069    0.00059
$h$              0.080    0.11     0.0049     0.0042
$w_0$            0.14     0.20     0.35       0.04
$w_a$            0.33     0.52     1.26       0.12
$\sigma_8$       N/A      0.023    0.074      0.0050
$n_s$            N/A      0.26     0.0033     0.0022

Unsurprisingly, the addition of the SR+NC data improves little over the
Planck-alone constraints for most cosmological parameters, except $w_0$,
$w_a$ and $\sigma_8$. As is well known, the Planck data alone for these 3
parameters suffer from severe degeneracies (their degeneracy parameters D,
as defined in Table 1, are 161, 209 and 93, respectively). Adding the
cluster data causes significant improvements in these parameters. In fact,
the improvements in the SR+NC+Planck combination are significant compared
either to SR+NC or to Planck alone; this shows that the improvements arise
by breaking degeneracies between the Planck and cluster datasets. The
joint constraints on $w_0$ and $w_a$ improve to the interestingly tight
levels of 0.04 and 0.12, respectively.
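
Degeneracy breaking of this kind can be seen in a two-parameter toy
example. When two independent data sets are combined, their Fisher matrices
add; if each has a strong degeneracy along a different direction, the joint
errors can be far smaller than either set alone. The two matrices below are
invented stand-ins (they are not the cluster or Planck matrices of this
paper), chosen so that their degeneracy directions are orthogonal.

```python
import numpy as np

def marginalized_errors(fisher):
    """1-sigma marginalized errors from the inverse Fisher matrix."""
    return np.sqrt(np.diag(np.linalg.inv(fisher)))

# Hypothetical 2x2 Fisher matrices for the same two parameters.
# The first is degenerate along p0 = -p1, the second along p0 = +p1.
F_clusters = np.array([[100.0,  99.0],
                       [ 99.0, 100.0]])
F_cmb = np.array([[100.0, -99.0],
                  [-99.0, 100.0]])

# Independent experiments: the joint Fisher matrix is the sum.
F_joint = F_clusters + F_cmb

err_clusters = marginalized_errors(F_clusters)
err_cmb = marginalized_errors(F_cmb)
err_joint = marginalized_errors(F_joint)

print("clusters alone:", err_clusters)
print("CMB alone:     ", err_cmb)
print("combined:      ", err_joint)
```

Each matrix alone gives an error of about 0.71 on either parameter, while
the sum is diagonal and gives about 0.071, a tenfold improvement that
neither data set could deliver by itself.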

4.11 Covariance Between Scaling Relations and 
Number Counts 

In our analysis, we have neglected correlations between the constraints
derived from different observables, i.e., the Fisher matrices were simply
added to obtain the joint constraints. Apart from possible systematic
measurement errors, the Y vs. T measurements of two different clusters, in
different regions of the sky, should indeed be uncorrelated (except,
perhaps, in the rare cases of very close neighbors physically affecting
each other). On the other hand, it is not obvious whether correlations
between the SR and NC approaches are negligible. For a given cluster, the
Y-M and the Y-T relations could well be correlated, through the underlying
physical origin of the scatter in these two relations. Those clusters with
unusually high Y values would then also be more likely to be included in
the detected sample. Indeed, in the limit that M and T are uniquely related
without any scatter, deviations in Y from the expected $\langle Y \rangle$
will change the Y-T relation and can simultaneously affect the number
counts (by moving a cluster to a different Y-bin).

To see if this indeed introduces a significant correlation between the SR
and NC constraints, we performed a suite of Monte Carlo calculations, in
which 500,000 random realizations of a mock cluster catalog were generated.
At the assumed redshift z = 0.5, each mock catalog contains ~2500 clusters,
drawn randomly from the underlying mass function above the mass limit of
$10^{14}\,M_\odot$. This limit corresponds to an SZ signal of $1\sigma$ at
z = 0.5, sufficiently below the $5\sigma$ detection threshold that clusters
with masses below this limit have a negligible chance to scatter above the
detection threshold. The total number of clusters in each realization of
the mock catalog is drawn from a Poisson distribution with a mean of 2,500.
The number of detected clusters is then ~400, which is roughly the number
of clusters in the redshift bin 0.45 < z < 0.5 in our fiducial model,
assuming the survey parameters given in § 2.5.

Using our cluster model, we assign a temperature and a Y parameter to each
cluster, based on its mass. A 10% intrinsic scatter and a $1\sigma$
measurement uncertainty, both drawn independently from Gaussian
distributions, are also added to the Y parameter, but no scatter is added
to the temperature. By assigning scatter only to the Y parameter (and not
to T), the Y-M and Y-T relations are fully correlated, so that any
resulting correlation between the SR and NC constraints will be
overestimated. The maximum correlation between the two approaches would be
achieved by anti-correlating the Y-M and Y-T relations; however, such an
anti-correlation does not seem physically realistic. We have assigned a
Fisher matrix to each single cluster in the SR approach, and these Fisher
matrices might differ in how strongly they correlate with the NC approach.
For example, the scaling-relation Fisher matrix of a cluster with Y just
above the detection threshold should be more correlated with the NC
approach than average, because scatter in Y can easily leave it undetected
and therefore change the number of detected clusters. To be exact, we would
therefore need to consider the correlation for each single SR Fisher
matrix, which is a daunting task. Instead, in order to reduce the
dimensions of the covariance matrix, we binned the SR observables into the
same Y bins used for the number counts (this does not significantly change
the SR constraints, since the Y-T relation does not have small-scale
features that would be missed by this binning). We then studied the
correlation between the cluster number and the binned SR observable
$S = (Y/Y_0(T))_{\rm bin}$, where $Y_0(T)$ is the Y parameter computed from
the cluster model (without scatter), and the subscript "bin" indicates that
the averaging is over all clusters in a Y bin (as opposed to over Monte
Carlo realizations).

As a result of the intrinsic scatter and the measurement uncertainty, S
generally deviates from unity. Using the 500,000 Monte Carlo realizations,
we computed the following three types of correlation coefficients:

$$ r_{N_i N_j} = \frac{{\rm Cov}(N_i, N_j)}{\sigma_{N_i}\,\sigma_{N_j}}, $$

$$ r_{N_i S_j} = \frac{{\rm Cov}(N_i, S_j)}{\sigma_{N_i}\,\sigma_{S_j}}, $$

$$ r_{S_i S_j} = \frac{{\rm Cov}(S_i, S_j)}{\sigma_{S_i}\,\sigma_{S_j}}, $$

where i and j refer to the Y-bin indices and, e.g.,
${\rm Cov}(N_i, N_j) = \langle (N_i - \bar{N}_i)(N_j - \bar{N}_j) \rangle$,
where the bar denotes averaging within the Y-bin, and the other correlation
coefficients are defined similarly. First, without any scatter on Y, we
checked that no correlation is introduced artificially in the treatment
outlined above (i.e., all coefficients r are zero). When we include the
scatter on Y, we find that all three types of correlation coefficients are
a few $\times 10^{-3}$, which is still consistent with zero within the
uncertainty of the calculation.
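
A scaled-down version of this Monte Carlo test can be sketched as follows.
Everything specific in it is an assumption made for illustration: a
power-law (Pareto) mass function in units of the mass limit, a toy
$Y_0 \propto M^{5/3}$ relation, three Y bins, and 2,000 rather than
500,000 realizations. It is not the cluster model of this paper, and the
names `one_realization`, `Y_EDGES`, etc. are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(12345)

N_REAL = 2000             # number of mock catalogs (the paper uses 500,000)
MEAN_CLUSTERS = 2500      # Poisson mean number of clusters above the mass limit
ALPHA = 2.0               # toy mass function: P(>M) = M**-ALPHA for M >= 1
INTRINSIC_SCATTER = 0.10  # fractional Gaussian scatter on Y
SIGMA_MEAS = 1.0          # measurement error; Y = 1 at the mass limit is a 1-sigma signal
Y_EDGES = np.array([5.0, 8.0, 13.0, np.inf])  # first edge: 5-sigma detection threshold
N_BINS = len(Y_EDGES) - 1

def one_realization():
    """Return (counts per Y bin, mean Y/Y0 per Y bin) for one mock catalog."""
    n_tot = rng.poisson(MEAN_CLUSTERS)
    mass = 1.0 + rng.pareto(ALPHA, size=n_tot)   # classical Pareto with M_min = 1
    y_model = mass ** (5.0 / 3.0)                # toy Y0; T has no scatter, so Y0(T) = Y0(M)
    y_obs = y_model * (1.0 + INTRINSIC_SCATTER * rng.standard_normal(n_tot))
    y_obs = y_obs + SIGMA_MEAS * rng.standard_normal(n_tot)
    which = np.digitize(y_obs, Y_EDGES) - 1      # -1 marks undetected clusters
    counts = np.empty(N_BINS)
    s_bin = np.empty(N_BINS)
    for b in range(N_BINS):
        sel = which == b
        counts[b] = sel.sum()
        s_bin[b] = np.mean(y_obs[sel] / y_model[sel])
    return counts, s_bin

samples = [one_realization() for _ in range(N_REAL)]
N = np.array([s[0] for s in samples])            # shape (N_REAL, N_BINS)
S = np.array([s[1] for s in samples])

# Correlation matrix of the stacked observables (N_1..N_3, S_1..S_3).
corr = np.corrcoef(np.hstack([N, S]), rowvar=False)
r_NN = corr[:N_BINS, :N_BINS]
r_NS = corr[:N_BINS, N_BINS:]
r_SS = corr[N_BINS:, N_BINS:]
off = ~np.eye(N_BINS, dtype=bool)

print("mean number detected:", N.sum(axis=1).mean())
print("max |r_NN| (i != j):", np.abs(r_NN[off]).max())
print("max |r_NS|:         ", np.abs(r_NS).max())
print("max |r_SS| (i != j):", np.abs(r_SS[off]).max())
```

With only 2,000 realizations the statistical noise on each coefficient is
roughly $1/\sqrt{2000} \approx 0.02$, so the coefficients should scatter
around zero at that level; increasing `N_REAL` lowers this noise floor.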

While this result may be surprising, the lack of correlations is explained
by the fact that the Y-values are still drawn independently for each
cluster, even after the scatter is included. In particular, moving clusters
across adjacent Y-bins clearly introduces correlations in the number of
excess clusters relative to the no-scatter case. The inclusion of scatter,
however, also changes the mean number of clusters in each Y-bin, and the
Y/T ratio in these bins. It is the excursions around these modified mean
values that we find to be essentially uncorrelated. In the Appendix, we use
a simplified toy model to show that the covariance between neighboring
bins, ${\rm Cov}(N_i, N_j)$, is strictly zero; one can similarly show
${\rm Cov}(N_i, S_j) = 0$ and ${\rm Cov}(S_i, S_j) = 0$.

4.12 The Importance of the Virial Overdensity 

Finally, as mentioned above, there is an apparent tension between our
results and the conclusions reached recently by Aghanim et al. (2008), who
find that the scaling relations are little affected by changes in the
dark-energy equation of state. However, we believe these two results are
not, in fact, in contradiction. We find that the dark energy parameters
$w_0$ and $w_a$ indeed have only a small direct effect on the scaling
relation. For instance, we find that the value of Y for a cluster with a
fixed temperature, at z = 0.2, increases by about 4% when changing
$w_0 = -1$ to $w_0 = -1.2$ (see Table 5). This is consistent with Figure 4
in Aghanim et al., which shows that the normalization of the Y-T relation
changes by a few percent over the $-1.2 < w_0 < -0.8$ range, with the
largest increase for the smallest value $w_0 = -1.2$, consistent with 4%.
We note that Aghanim et al. also include a cosmology-dependent factor,
which is a power-law in the normalized Hubble parameter E(z), in their
definition of Y. At low redshift, this factor is ~1, and does not drive the
$w_0$-sensitivity. At higher redshifts, this factor drives the
$w_0$-sensitivity; if we were to scale out this factor from the definition
of Y, at z = 1.5 we would predict a ~3% increase in Y when changing
$w_0 = -1$ to $w_0 = -1.2$, which is again consistent with the change seen
in Figure 4 in Aghanim et al. (their middle panel). The constraints we
obtain here on $w_0$ and $w_a$ derive in large part from this modest
dependence of the scaling relations on the dark energy equation of state,
and owe much to the large number of clusters and relatively weak
degeneracies. Our results are also explicitly based on the simulations by
Kuhlen et al. (2005) of how dark energy affects the virial overdensity.
Although there is no clear sign of any disagreement, it would be worthwhile
to explicitly check the consistency of the $w_0$-dependence of the virial
overdensity between the two simulations.



5 CONCLUSIONS 

In this paper, we studied the utility of the scaling relation be- 
tween the Sunyaev-Zeldovich decrement Y and temperature 
T of galaxy clusters; in particular, the constraint that this 
relation may place on cosmological and on cluster structural 
parameters. A phenomenological cluster model is adopted, 
which has 15 free parameters to describe cluster structure, 
and its dependence on mass and redshift. We demonstrated 
that this model fits available cluster observations, including 
the temperature profile outside the core. We then used this 
model to forecast constraints that could become available 
from a future survey, containing several thousand clusters. 

Our basic result is that the scaling relations have a statistical
constraining power on cosmological parameters comparable to that of cluster
number counts, even after we marginalize over the cluster parameter
uncertainties. We investigated where the cosmology sensitivity in the
scaling relation comes from, and found that the constraints are driven by
different physics for different parameters. The constraints on $\Omega_m$
and $\Omega_b$ arise mainly through the gas fraction $f_g$; the constraints
on $\Omega_{DE}$ and h arise predominantly through $D_A$; and the
constraints on $w_0$ and $w_a$ are driven by the characteristic virial
overdensity $\Delta_{\rm vir}$ at low redshifts, but by $D_A$ at high
redshifts.

The scaling relation constraints have significant degeneracies. The most
significant of these is between $\Omega_b$ and $\Omega_m$, whereas the
parameters suffering the least degeneracy are $w_0$ and $w_a$. These dark
energy equation-of-state parameters have statistical errors from the
scaling relations that are somewhat tighter than those from the number
counts. Combining the scaling relation with the number counts, including
multiple Y-bins, and combining cluster data with expected CMB temperature
and polarization measurements by Planck all help in breaking parameter
degeneracies. In a model that uses 6,800 clusters, combines the
SR+NC+Planck data, and assumes a prior of 10% on the 15 cluster model
parameters, we find tight constraints on the dark energy equation of state
parameters, $\Delta w_0 = 0.04$ and $\Delta w_a = 0.12$.

The most significant caveat to our conclusions is that we assumed a
particular parameterization of the cluster model. Strictly speaking, our
numerical results are valid only within the confines of this specific
model. However, our results do indicate, more generally, that there will be
significant statistical constraining power in the scaling relations data,
and that there is also sufficient cosmological sensitivity in the scaling
relations to be relevant as a cosmology probe. The latter point is the main
new conclusion of this work. Indeed, the cosmology dependencies that we
find to drive the SR constraints should be relatively robust features:
$D_A$, $\Delta_{\rm vir}$, and $f_g$ are likely to depend on cosmology, in
any reasonable cluster model, in the way we exploited here. The likely
outcome of the analysis of any actual cluster survey data is that it will
force us to adopt some other, physically motivated parametric description
of cluster structure, and the parameters of that new model will then have
to be constrained, together with the cosmological parameters.

We have studied various other explicit caveats, such as the impact of
partial X-ray coverage of the SZ sample, uncertain scatter and
completeness, or departures from simple power-laws in the dependence of the
cluster parameters on mass and redshift. We generally find that the
constraints loosen, as expected, but remain appreciable. Overall, our work
suggests that explicit scaling relations do not, by themselves, strongly
constrain the highly degenerate cluster parameters, but that they should be
a useful component in extracting cosmological information from large future
cluster surveys.

APPENDIX A: SCATTER AND THE (LACK OF) CORRELATIONS



As mentioned in § 4.11 in the text, in the limit that M and T are uniquely
related without any scatter, deviations in Y around its mean value
$\langle Y \rangle$ can simultaneously change the Y-T relation and affect
the number counts (by moving a cluster to a different Y-bin). It is not a
priori obvious whether this introduces correlations between the number
counts in different Y bins, and/or cross-correlations between the scaling
relations and the number counts. However, in § 4.11 we performed Monte
Carlo simulations and found that the correlations and cross-correlations
are both negligibly small. In this Appendix, we illustrate how this (lack
of) correlation can arise, using a simplified toy model.

Consider an experiment in which there are two bins, initially with random
numbers $n_1$ and $n_2$ of objects in each bin, drawn independently from
Poisson distributions with means of $\bar{n}_1$ and $\bar{n}_2$.
Subsequently, each object in bin #1 is either relocated to bin #2 with a
probability $p$, or left in bin #1 with a probability of $(1 - p)$, after
which there are $N_1$ and $N_2$ objects in the two bins. To simplify the
mathematics below, we do not consider relocating objects from bin #2 to
bin #1; this does not affect the essence of the argument. The covariance
between $N_1$ and $N_2$ can be written as

$$
\begin{aligned}
{\rm Cov}(N_1, N_2) &= {\rm Cov}(N_1, N_2 - n_2 + n_2) \\
                    &= {\rm Cov}(N_1, \Delta n + n_2) \\
                    &= {\rm Cov}(N_1, \Delta n) + {\rm Cov}(N_1, n_2) \\
                    &= {\rm Cov}(N_1, \Delta n) + {\rm Cov}(n_1 - \Delta n, n_2) \\
                    &= {\rm Cov}(N_1, \Delta n) - {\rm Cov}(\Delta n, n_2),
\end{aligned}
\tag{A1}
$$

where $\Delta n \equiv N_2 - n_2 = n_1 - N_1$ is the number of objects
relocated from bin #1 to bin #2, and in the last step, we have used
${\rm Cov}(n_1, n_2) = 0$, which is true by construction.

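In order to compute the last two terms on the right hand side, we first
need to know the probability distribution of $\Delta n$,

$$
P(\Delta n) = \sum_{n_1 = \Delta n}^{\infty} P(\Delta n \,|\, n_1)\, P(n_1).
\tag{A2}
$$

By definition, $n_1$ is drawn from a Poisson distribution,

$$
P(n_1) = {\rm Pois}(n_1; \bar{n}_1)
       = \frac{\bar{n}_1^{\,n_1}\, e^{-\bar{n}_1}}{n_1!},
\tag{A3}
$$

and, given $n_1$, $\Delta n$ follows a binomial distribution,

$$
P(\Delta n \,|\, n_1) = {\rm Bino}(\Delta n; n_1, p)
  = \frac{n_1!}{\Delta n!\,(n_1 - \Delta n)!}\; p^{\Delta n} (1 - p)^{n_1 - \Delta n}.
\tag{A4}
$$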

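Substituting equations (A3) and (A4) into equation (A2), we find

$$
\begin{aligned}
P(\Delta n) &= \sum_{n_1 = \Delta n}^{\infty}
   \frac{\bar{n}_1^{\,n_1}\, e^{-\bar{n}_1}}{n_1!}\,
   \frac{n_1!}{\Delta n!\,(n_1 - \Delta n)!}\;
   p^{\Delta n} (1 - p)^{n_1 - \Delta n} \\
 &= \sum_{n_1 = \Delta n}^{\infty}
   \frac{\bar{n}_1^{\,n_1}\, e^{-\bar{n}_1}}{\Delta n!\,(n_1 - \Delta n)!}\;
   p^{\Delta n} (1 - p)^{n_1 - \Delta n}.
\end{aligned}
\tag{A5}
$$

Substituting $n_1 = N_1 + \Delta n$, and moving outside the summation all
factors not containing $N_1$, equation (A5) is simplified to

$$
P(\Delta n) = \frac{(p\,\bar{n}_1)^{\Delta n}\, e^{-\bar{n}_1}}{\Delta n!}
   \sum_{N_1 = 0}^{\infty} \frac{\left[\bar{n}_1 (1 - p)\right]^{N_1}}{N_1!}.
\tag{A6}
$$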

Equation (A6) can be simplified further by noting that the summation on the
right hand side is the Taylor expansion of $e^{\bar{n}_1 (1 - p)}$. After
this simplification, we obtain the final expression of $P(\Delta n)$,

$$
P(\Delta n) = \frac{(p\,\bar{n}_1)^{\Delta n}\, e^{-p\,\bar{n}_1}}{\Delta n!}
            = {\rm Pois}(\Delta n; p\,\bar{n}_1).
\tag{A7}
$$

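This corresponds to the intuitive result that $\Delta n$ follows a Poisson
distribution with a mean of $p\,\bar{n}_1$. It can be shown similarly that
$P(N_1) = {\rm Pois}(N_1; (1 - p)\,\bar{n}_1)$ and
$P(N_2) = {\rm Pois}(N_2; \bar{n}_2 + p\,\bar{n}_1)$.
${\rm Cov}(N_1, \Delta n)$ can now be computed by considering the variance
of $n_1$,

$$
\begin{aligned}
{\rm Var}(n_1) &= {\rm Var}(N_1 + \Delta n) \\
               &= {\rm Var}(N_1) + {\rm Var}(\Delta n) + 2\,{\rm Cov}(N_1, \Delta n).
\end{aligned}
\tag{A8}
$$

Since $n_1$, $N_1$ and $\Delta n$ all follow Poisson distributions, their
variances are equal to their expectation values, which are $\bar{n}_1$,
$(1 - p)\,\bar{n}_1$ and $p\,\bar{n}_1$, respectively. With this
information, it follows from equation (A8) that

$$
{\rm Cov}(N_1, \Delta n) = 0.
\tag{A9}
$$

Similarly, ${\rm Cov}(\Delta n, n_2) = 0$ follows by considering the
variance

$$
\begin{aligned}
{\rm Var}(N_2) &= {\rm Var}(n_2 + \Delta n) \\
               &= {\rm Var}(n_2) + {\rm Var}(\Delta n) + 2\,{\rm Cov}(\Delta n, n_2),
\end{aligned}
\tag{A10}
$$

and noting ${\rm Var}(N_2) = \bar{n}_2 + p\,\bar{n}_1$,
${\rm Var}(n_2) = \bar{n}_2$, and ${\rm Var}(\Delta n) = p\,\bar{n}_1$.
This concludes the proof that
${\rm Cov}(N_1, N_2) = {\rm Cov}(N_1, \Delta n) - {\rm Cov}(\Delta n, n_2) = 0$.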
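
The two-bin result can be checked numerically by simulating the experiment
directly. In the sketch below, the values of the Poisson means, the
relocation probability and the number of trials are arbitrary illustrative
choices, and the variable names simply mirror the symbols used in the
derivation.

```python
import numpy as np

rng = np.random.default_rng(2024)

n_trials = 200_000
mean_n1, mean_n2, p = 50.0, 30.0, 0.3   # hypothetical values for illustration

# Initial, independent Poisson counts in the two bins.
n1 = rng.poisson(mean_n1, size=n_trials)
n2 = rng.poisson(mean_n2, size=n_trials)

# Each object in bin 1 moves to bin 2 with probability p (binomial thinning).
delta_n = rng.binomial(n1, p)
N1 = n1 - delta_n
N2 = n2 + delta_n

cov_N1_N2 = np.cov(N1, N2)[0, 1]   # expected to vanish
cov_n1_N2 = np.cov(n1, N2)[0, 1]   # for contrast: expected to be p * mean_n1

print(f"Cov(N1, N2) = {cov_N1_N2:+.3f}   (expected 0)")
print(f"Cov(n1, N2) = {cov_n1_N2:+.3f}   (expected {p * mean_n1:.1f})")
print(f"mean(N1) = {N1.mean():.2f}, Var(N1) = {N1.var():.2f}   (both expected {(1 - p) * mean_n1:.1f})")
```

The contrast is instructive: the initial count $n_1$ is clearly correlated
with the final count $N_2$, since a richer bin #1 sends more objects across,
yet the final counts $N_1$ and $N_2$ are uncorrelated, as derived above.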

The derivation above can be generalized to show ${\rm Cov}(N_i, N_j) = 0$
when $i \neq j$ in the case of multiple Y bins. Similar to equation (A1),
${\rm Cov}(N_i, N_j)$ can be decomposed as

$$
\begin{aligned}
{\rm Cov}(N_i, N_j) &= {\rm Cov}\Big(\sum_l \Delta n_{l \to i},\; \sum_m \Delta n_{m \to j}\Big) \\
                    &= \sum_{l,\,m} {\rm Cov}(\Delta n_{l \to i}, \Delta n_{m \to j}),
\end{aligned}
\tag{A11}
$$

where $\Delta n_{l \to i}$ is the number of clusters relocated from bin #l
to bin #i by the scatter. For $l \neq m$,
${\rm Cov}(\Delta n_{l \to i}, \Delta n_{m \to j}) = 0$ because of the
independence between $n_l$ and $n_m$, the number counts in the absence of
scatter. To compute ${\rm Cov}(\Delta n_{l \to i}, \Delta n_{l \to j})$, we
first need the probability distributions of $\Delta n_{l \to i}$,
$\Delta n_{l \to j}$ and
$\Delta n_{l \to i,j} \equiv (\Delta n_{l \to i} + \Delta n_{l \to j})$.
Following a procedure similar to the case of two bins above, we can show
that these quantities all follow Poisson distributions with expectation
values of $p_{l \to i}\,\bar{n}_l$, $p_{l \to j}\,\bar{n}_l$ and
$p_{l \to i,j}\,\bar{n}_l = (p_{l \to i} + p_{l \to j})\,\bar{n}_l$,
respectively. Here, $p_{l \to i}$ is the probability of a cluster in bin #l
being scattered into bin #i. For the Gaussian scatter assumed in this
paper,
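$$
p_{l \to i} =
  \frac{\int_{M_{\min}(Y_l)}^{M_{\max}(Y_l)} g(Y_i, M)\, \frac{dn}{dM}\, dM}
       {\int_{M_{\min}(Y_l)}^{M_{\max}(Y_l)} \frac{dn}{dM}\, dM},
\tag{A12}
$$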

where $M_{\min}(Y_l)$ and $M_{\max}(Y_l)$ are the minimum and maximum
masses in bin #l in the absence of scatter, and $g(Y_i, M)$ was given in
equation (21). By considering ${\rm Var}(\Delta n_{l \to i,j})$, one can
then show ${\rm Cov}(\Delta n_{l \to i}, \Delta n_{l \to j}) = 0$;
substituting back into equation (A11), we find ${\rm Cov}(N_i, N_j) = 0$.
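
The multi-bin statement can be checked in the same spirit. The sketch below
assumes three bins with an invented transition matrix `P`, where `P[l, i]`
plays the role of $p_{l \to i}$ (including the probability of staying in
the same bin, so each row sums to one); it does not use the Gaussian-scatter
form of equation (A12). The helper `scatter_counts` is hypothetical and
draws the multinomial split through a chain of conditional binomials.

```python
import numpy as np

def scatter_counts(n_src, probs, rng):
    """Split each count in n_src among len(probs) destination bins.

    Uses a chain of conditional binomial draws, which is equivalent to a
    multinomial draw: every object independently lands in bin i with
    probability probs[i]. Assumes all entries of probs are positive.
    """
    out = np.zeros((len(probs), n_src.size), dtype=np.int64)
    remaining = n_src.copy()
    p_left = 1.0
    for i, p_i in enumerate(probs[:-1]):
        out[i] = rng.binomial(remaining, min(1.0, p_i / p_left))
        remaining = remaining - out[i]
        p_left -= p_i
    out[-1] = remaining
    return out

rng = np.random.default_rng(7)
n_trials = 200_000

# Hypothetical no-scatter mean counts and transition probabilities.
mean_counts = np.array([40.0, 25.0, 10.0])
P = np.array([[0.80, 0.15, 0.05],
              [0.20, 0.60, 0.20],
              [0.05, 0.25, 0.70]])

# Final counts N_i = sum over source bins l of the objects sent from l to i.
N = np.zeros((3, n_trials), dtype=np.int64)
for l in range(3):
    n_l = rng.poisson(mean_counts[l], size=n_trials)
    N += scatter_counts(n_l, P[l], rng)

cov = np.cov(N)                      # 3x3 covariance matrix of the final counts
expected_mean = mean_counts @ P      # expected final mean count in each bin
off_diag = ~np.eye(3, dtype=bool)

print("covariance matrix of final counts:\n", np.round(cov, 2))
print("expected means (= expected variances):", expected_mean)
```

The off-diagonal elements of `cov` should be consistent with zero, while the
diagonal elements should match the modified mean counts, as expected if
each $N_i$ is itself Poisson distributed.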



